Data Farming Cultivates New Insights

June 2005
By Maryann Lawlor

Research into how to improve warfighting techniques is taking place at Johns Hopkins University’s new Warfare Analysis Laboratory, where interactive analytical tools support a distributed collaborative environment. With the help of industry and academia, the U.S. Marine Corps Warfighting Laboratory’s Project Albert is exploring data farming, which will provide analysts with additional insights.
Project leverages small models to explore how random factors affect operational outcomes.

Predicting a volatile enemy’s next move is still the bailiwick of soothsayers, but technology may help future commanders choose an appropriate counteraction. Using the findings of a research project that delves into data farming, the U.S. Marine Corps and industry are introducing tools that help warfighters better understand the virtually infinite possibilities in the battlespace. The capability could assist in finding ways not only to defeat a martyrdom-based adversary but also to prevent this enemy from growing its ranks.

Research into this capability is a congressionally mandated effort chartered in 1998. Called Project Albert, it includes support from the Maui High Performance Computing Center and Referentia Systems Incorporated, both in Hawaii; The Mitre Corporation; Northrop Grumman; the Naval Postgraduate School; the Naval Academy; and others, all under the direction of the Marine Corps Warfighting Laboratory, Quantico, Virginia. The project is named after Albert Einstein.

Dr. Gary Horne, director of Project Albert and founding member of the data farming team for Referentia Systems, explains that the work focuses on using small abstract models to capture the essence of a military question. Although the models are abstract, the queries are very much a part of the real world. They include determining when it is advantageous to employ decentralized rather than centralized command and control capabilities, how to mitigate a bio-terrorist attack in a free society and what characteristics are important in military convoy protection systems.

Because the models are small, they can be processed very quickly using high-performance computing capabilities. “You get a dynamic combination because you can look at literally thousands, tens of thousands, hundreds of thousands, even millions of runs. You can vary the parameters, which are numerous because in today’s uncertain world, you’re up against so many different factors. You can’t really predict anything, but if you look at enough possibilities, you can begin to understand,” Horne says.

And understanding the landscape, the Project Albert team has learned from experience, is the only achievable goal. Initially, Horne believed that by using supercomputers and small models, the team would be able to create a complete picture. “Well, the answer is no. You can never cover the landscape because you’re up against virtual infinity,” he explains.

The Project Albert team uses data farming to understand the landscape. This technique differs from data mining because it does not just dig through data; it actually uses data to grow new insights. For example, models can be set up to examine the effects of different communications system configurations, such as networked versus hierarchical communication designs. Horne points out that although many would believe the networked environment would result in a better outcome, the team found that this is not always the case.

To explore some of the billions of possible outcomes, the models are run using different parameters and values, but Horne quickly points out that the researchers also run the same set of parameters with different seed values to understand the distribution of results over the same parameter set. The seed varies a random aspect of the situation that may be considered totally insignificant, such as where a warfighter was standing when combat began. The change may be slight—positioned one grid square to the left or right.
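
The replication idea described above can be illustrated with a minimal sketch in Python. It is not Project Albert's software: the toy engagement model, its parameters and the fixed red hit probability are all assumptions invented for illustration. The sketch sweeps a small parameter grid and reruns each parameter set under many seeds, so that each set yields a distribution of results rather than a single answer.

```python
import random
from itertools import product

RED_HIT_PROB = 0.3  # assumed constant, chosen only for illustration


def toy_engagement(blue_size, red_size, blue_hit_prob, seed):
    """Hypothetical stand-in for a small abstract model; returns True if blue wins."""
    rng = random.Random(seed)  # the seed fixes every random draw in this run
    blue, red = blue_size, red_size
    while blue > 0 and red > 0:
        # Both sides fire simultaneously; each shooter hits independently.
        red_losses = sum(rng.random() < blue_hit_prob for _ in range(blue))
        blue_losses = sum(rng.random() < RED_HIT_PROB for _ in range(red))
        blue = max(blue - blue_losses, 0)
        red = max(red - red_losses, 0)
    return blue > 0  # mutual destruction counts as a blue loss


def farm(blue_sizes, hit_probs, seeds, red_size=10):
    """Run every parameter combination once per seed; return blue's win rate for each."""
    seeds = list(seeds)
    results = {}
    for blue_size, hit_prob in product(blue_sizes, hit_probs):
        wins = sum(toy_engagement(blue_size, red_size, hit_prob, s) for s in seeds)
        results[(blue_size, hit_prob)] = wins / len(seeds)
    return results


if __name__ == "__main__":
    for params, rate in farm([8, 10, 12], [0.2, 0.3, 0.4], range(200)).items():
        print(params, f"blue win rate: {rate:.2f}")
```

Because every run is keyed by its seed, any single surprising result can be reproduced exactly and examined, which is what makes it possible to study a rare outcome instead of discarding it.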

Project Albert engaged history students to examine the consequences of changing seed values. In its experimental history program, models were built to mimic the Battle of Midway. Using the exact same parameters, the United States won the battle only one out of 100 times. This demonstrates how much success in an operation can depend on what Horne calls “the right side of chance” because a small change can result in a loss rather than a win. “When you think about that, it’s both scary and enlightening that in history you only get one chance,” he notes.

Horne admits that when he first started Project Albert eight years ago, team members believed they could determine all the possible outcomes given the characteristics of the forces and operations. But he soon changed his mind. “That’s one of the evolutions of this project. We know we are up against virtual infinity, and we probably should have known that since November 9, 1989, because, when the Berlin Wall fell, we should have known that it’s not just one gigantic force-on-force but there are so many different factors that we could be up against,” he states.

It took the terrorist attacks of September 11, 2001, to punctuate how uncertain the world had become, Horne notes. “How do we handle uncertainty? How do we give our decision makers advice that can be based on the fact that we really don’t know anything for sure? Do we give up? Or do we try something different?” he asks.

Project Albert is an attempt to try something different, Horne says. Data farming turns operations research on its head. Instead of putting additional detail into the models, it reduces the details, yet captures the essence of the question. “You can grow data in interesting regions then add parameters and components that are important. After that’s done, we run it and that’s the whole metaphor of data farming. You’re not just growing data—a million pieces of data—and mining it. You are going back in and growing more data,” he explains.
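
One way to picture the "going back in and growing more data" loop is the sketch below. It is an illustration under stated assumptions, not the team's method: the toy `win_rate` model and the zoom-in rule (rerun the sweep inside the interval where the outcome changes fastest) are both hypothetical choices made here to show the iterative shape of the process.

```python
import random


def win_rate(hit_prob, seeds):
    """Hypothetical toy model: share of seeded runs with at least 5 hits in 10 shots."""
    wins = 0
    for seed in seeds:
        rng = random.Random(seed)
        hits = sum(rng.random() < hit_prob for _ in range(10))
        wins += hits >= 5
    return wins / len(seeds)


def refine(low, high, points, seeds, rounds=3):
    """Sweep a parameter, then repeatedly re-sweep the most interesting interval."""
    history = []
    for _ in range(rounds):
        step = (high - low) / (points - 1)
        grid = [low + i * step for i in range(points)]
        rates = [win_rate(p, seeds) for p in grid]
        history.append(list(zip(grid, rates)))
        # "Interesting" here means the adjacent pair with the largest jump in outcome.
        jumps = [abs(rates[i + 1] - rates[i]) for i in range(points - 1)]
        i = jumps.index(max(jumps))
        low, high = grid[i], grid[i + 1]  # grow more data inside that interval
    return history


if __name__ == "__main__":
    for round_number, sweep in enumerate(refine(0.0, 1.0, 5, list(range(200))), 1):
        print(round_number, [(round(p, 3), rate) for p, rate in sweep])
```

Each round spends the same number of runs on a narrower slice of the parameter space, so detail accumulates where the outcome is most sensitive rather than uniformly across the whole landscape.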

Understanding the battlespace landscape is only one product of data farming, and Horne is even more excited by a second product: outliers. An outlier is a data point that is totally different from the vast majority of the data, and Horne treasures these like gems.

He learned the value of outliers the hard way. In 1986, long before Project Albert began, Horne was an analyst for a naval exercise where he was collecting data about the amount of time it took to offload cargo from ships. While the process took approximately three hours for most of the barges and crews, in one case the offloading lasted 12 hours. Following data analysis standard operating procedure, Horne discarded the one anomaly. “That’s what we were taught, and that’s what we did. But that’s not what Project Albert does. That’s not what I do as the lead founding member of data farming. I take that outlier and I cherish it. I hold it in my hand, and I look at it like a jewel because outliers can lead us. Outliers can tell us things that nothing else can,” he says.

West Point cadets participate in a Project Albert workshop in February. The project sponsors both local and international workshops to examine the potential of data farming.

He knows this because one year later, while collecting data for the same naval exercise, the anomaly occurred again. This time, Horne investigated the cause and found that one crew had placed one of the containers so close to the front of the barge that the rough terrain cargo handler could not maneuver to offload it. As a result, the barge had to return to the ship to be reloaded—a process that took 12 hours.

“If we had cherished that outlier, if we had put that in the database, if we had documented it well instead of throwing it away, maybe the guys would have known not to put the container so close to the front of the barge,” he explains.
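
The practice of keeping an anomaly on record instead of deleting it can be sketched in a few lines of Python. This is an illustration only: the offload times are made-up values that echo the three-hour and 12-hour figures in the story, and the modified z-score test (based on the median absolute deviation, with the conventional 3.5 cutoff) is a standard statistical technique substituted here, not one the article attributes to Horne.

```python
from statistics import median


def flag_outliers(values, threshold=3.5):
    """Return the indices of values whose modified z-score exceeds the threshold."""
    med = median(values)
    mad = median([abs(v - med) for v in values])  # median absolute deviation
    if mad == 0:
        # More than half the values are identical; flag anything that differs.
        return [i for i, v in enumerate(values) if v != med]
    return [i for i, v in enumerate(values)
            if 0.6745 * abs(v - med) / mad > threshold]


# Hypothetical offload times in hours, one entry per barge.
offload_hours = [2.8, 3.1, 3.0, 2.9, 3.2, 12.0, 3.0, 3.1]

# Keep every observation; annotate the flagged ones for follow-up instead of deleting them.
flagged = set(flag_outliers(offload_hours))
records = [{"barge": i, "hours": h, "investigate": i in flagged}
           for i, h in enumerate(offload_hours)]
```

The design choice is that the full data set survives with a flag attached, so the unusual case stays available for later investigation and documentation.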

The significance of outliers was confirmed again 10 years later at the Marine Corps Air Ground Combat Center in Twentynine Palms, California, but this time, it taught Horne a different valuable lesson. He was working on simple models with Capt. Tom Eipp, USMC, prior to the Hunter Warrior exercise in March 1997. The objective of the exercise was to test the viability of hunter-killer teams against a large mechanized force.

Using an abstract model and what Horne and Capt. Eipp believed were reasonable parameters, Horne ran the simulation and the blue team lost. The test was repeated, using the same parameters but a different seed value. The blue team lost again. After repeating the run numerous times, the results were the same. “Now here’s the punch line. The 39th time we ran it, blue actually won. So at the end of the day, there I am sitting with the captain, and we looked at each other and I said, ‘I give up. We are sending 12,000 people out there and spending millions of dollars for this particular concept. We’ve got to tell someone. We only won one out of 100 times.’ Capt. Eipp looked at me and said, ‘Dr. Horne, you just don’t get it.’ He said it very politely. ‘I am a Marine, and if we can win in silicon one out of 100 times, I guarantee you, Dr. Horne, I get on the battlefield; my fellow Marines get on the battlefield. We will win, because the odds aren’t one out of 100. When we go out there for the real thing, it’s not going to be the 99. It’s going to be the one.’ Well, talk about motivating!” Horne relates.

It was this passion that Horne brought to Project Albert when it began a year later. His experiences in the 1980s and the lesson he learned about dedication to duty 10 years later demonstrated that data anomalies would help commanders understand the possibilities, and in some ways the odds, in planning operations. Horne stresses that predicting the outcome of a mission is not possible and the purpose of the modeling capability is not to recommend tactics to a commander for a specific mission. However, it can provide commanders with some insights such as that one course of action could result in three possible outcomes: one 10 percent of the time, one 20 percent of the time and one 70 percent of the time.
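
The kind of summary described above, in which one course of action maps to several outcomes with different frequencies, amounts to tallying the results of many runs. The Python sketch below shows that tally under stated assumptions: the outcome labels and the list of 100 run results are hypothetical, constructed only to mirror the 10, 20 and 70 percent example.

```python
from collections import Counter


def outcome_shares(outcomes):
    """Return the fraction of runs that ended in each distinct outcome."""
    counts = Counter(outcomes)
    total = sum(counts.values())
    return {label: count / total for label, count in counts.items()}


# Hypothetical results of 100 seeded runs of one course of action.
runs = ["objective held"] * 70 + ["stalemate"] * 20 + ["withdrawal"] * 10
shares = outcome_shares(runs)
```

The result is a description of possibilities and their relative frequency within the model, not a prediction of what will happen on a given mission.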

It is precisely the overabundance of possibilities that poses the biggest challenge for Project Albert, Horne admits. Millions of pieces of data can be banked, but translating those pieces into understanding or even locating one piece for a commander at a critical moment is a problem that the team is working on.

Despite this ongoing challenge, the Project Albert team continues to apply its techniques to help address some of the most difficult problems combat troops as well as homeland security agencies face today. For instance, team members worked with models to create a convoy-fanning tool to help Marines place electronic countermeasures in the most effective configuration to help counteract improvised explosive devices.

On a broader scale, Horne has been applying data farming to the global war on terrorism since the September 11 attacks. He was set to begin teaching a generic data-farming course in Stockholm, Sweden, in mid-September 2001. On September 12, 2001, the Project Albert team came together to revamp the course material so students could examine the issue of combating a martyrdom-based enemy.

Horne stresses that a common fallacy is that the security threat is from suicide bombers. “On September 12th, when I sat down with my team, I said, ‘We have a martyrdom-based enemy. That’s their mindset. We’re not up against suicide bombers. They’re martyrs. That’s what’s in their heads. I don’t care what’s in your head. You’ve got to know what’s in their heads, and in their heads they’re martyrs. If you confuse them with suicide bombers, you’re going to get it wrong,’” he states. Horne admits that although the right premise could still result in an erroneous conclusion, it is the best place to start.

Armed with revised lesson plans, Horne headed out on the first trans-Atlantic flight after the terrorist attacks. And when he began teaching the class, he found that he had a few lessons to learn as well. His students did not merely accept the distinction between suicide bombers and martyrs; they took the discussion a bit further.

“They weren’t interested just in how to combat or defend against this threat. They wanted to know how to keep it from growing. How do you stop a kid born today from becoming a martyr in the year 2021? Are we going to be faced with people who are going to be martyrs and going to have new technology, or can we stop the enemy from growing in numbers? And what can our data-farming techniques do to help us grow data and understand those possibilities? It is very complicated,” Horne admits.

As the final year of funding draws near, the Project Albert team is in the process of packaging the tools it has developed and examining the logical next step for the work. For instance, the researchers will explore whether data-farming insights could be used in real time in a tactical situation or if the value is only in narrowing parameter sets.

Horne says that experiments already have demonstrated that the tools help in data analysis, but the next step will be to determine whether they could also help commanders decide on a course of action. “We’re looking at some of that now to see if that’s viable. We would have to determine how to refine the techniques and whether the insights could be delivered in real time. We’re asking what-if questions. We ask about the human interface between the data and the decision maker. That’s the tough one, and we’re years away from finding a solution to that issue,” he states.



