Imagine being able to predict when a war will break out. Well, scientists are trying, and they’re getting awfully close.
In 2010, when the WikiLeaks scandal broke, a group of friends gathered together at Bitly Inc in New York to try to bring order to the chaotic mess of data WikiLeaks had unleashed.
They used a simple code to extract dates and locations from about 77,000 unclassified reports from both simple stop-and-search operations and battles. This revealed unexpected hot spots: the Pakistani border, and the country’s main highway, which had experienced a series of violent outbreaks.
Recent advances in big data technologies have shown that we’re close to being able to predict exactly when a battle will break out. However, there’s one thing that will stymie advances in this field: the inherent unpredictability of humans.
This year, when big data became a buzzword, the team reunited at Bitly’s offices to give the project another shot. This time, they teamed up with the brightest mathematical minds for a more audacious goal: a war zone prediction model. The researchers found a general pattern to the violence in Afghanistan, using it to determine whether an uprising would take place in each province, and its level of intensity.
The model worked with surprising accuracy and didn’t fail even when President Obama changed the rules of the game by sending in 30,000 additional troops.
The project, the results of which were published online by the Proceedings of the National Academy of Sciences in July, is just one small part of a growing movement to anticipate episodes of armed conflict using algorithmic computational techniques. Still, we have a long road ahead of us before this data is turned into actionable intelligence — a matter of life or death on the battlefield.
As Lt. Gen. Michael Oates, head of the Joint Improvised Explosive Device Defeat Organization, recently stated, “There is no shortage of data. There is a dearth of analysis.”
Iraq: the first ‘big data war’
Bitly’s research wasn’t the first time a group of renegade scientists brought the power of analytics to a war zone. For the better part of a decade, bringing big data to the battlefield has been the job of civilian researchers.
Cast your mind back to the spring of 2003, when four countries participated in the invasion of Iraq and succeeded in toppling the regime of Saddam Hussein in 21 days. At Oxford University, a young scholar had a theory, one that did not sit very well with centuries of political theory. He wondered whether wars share a single, predictable pattern.
Sean Gourley, a New Zealand-born graduate student, told me that “during one of those classic Oxford dinner conversations where you sit around these high tables, Harry Potter-style” he butted heads with James Woolsey, the former director of the CIA.
“I had a hunch that there might be some strong, mathematical pattern that might emerge once we’d looked at Iraq,” he said. “No one had really done it before.”
Using a combination of reports from 130 news sources, SMS-based communications between freelance journalists and photographers stationed in Baghdad, plus any information from the frontlines he could get his hands on, he set to work on an algorithm.
Gourley told me that he harbored strong reservations about the research. “My God … we were writing software to extract when people were dying and how they were dying,” Gourley said, admitting that he frequently considered throwing in the towel.
To make matters more complicated, reports from various news media often conflicted. As we know now, war reporting is notoriously inaccurate, and it becomes even less reliable when comparing one armed conflict to another. In Afghanistan, for example, rural environs and a depleted number of reporters in the field led to less coverage than in the Iraq conflict. This remains a problem for researchers today.
Gourley told me they learned that the best approach was for humans and algorithms to work hand-in-hand. By then, a team of physicists was working with him to see him through this crazy experiment, and they were investigating other conflict areas, including Sierra Leone. According to Gourley, senior-level military personnel were keeping a watchful eye on their progress.
However, his frequent attempts to convince a contact at the Pentagon to hand over data fell on deaf ears. As a foreign national, he was not able to access official U.S. military reports.
This proved to be a blessing in disguise. At that time, the military’s analysts were typically trained in political science, not computer science, and their reports were spotty at best. “They couldn’t write a python script if you paid them,” Gourley joked. After the WikiLeaks scandal, he discovered that his datasets were superior to the U.S. military’s. “It turns out that we had 80 percent of what they had; they only had 70 percent of what we had,” he explained.
The results, published in Nature in 2009, found that insurgent wars follow an approximate power law: the frequency of attacks falls off in proportion to attack size raised to the power of 2.5. That means that for any insurgent war, an attack with 10 casualties is about 316 times more likely to occur than one with 100 casualties.
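To see where the 316 figure comes from, here is a minimal sketch of the arithmetic in Python. It assumes only what the paragraph above states, that attack frequency scales as casualty count raised to the power of negative 2.5. The function name `relative_likelihood` is invented for illustration and is not taken from the researchers' software.

```python
# Illustrative only: assumes frequency is proportional to size ** -ALPHA,
# using the exponent of 2.5 quoted in the article.
ALPHA = 2.5


def relative_likelihood(smaller: float, larger: float, alpha: float = ALPHA) -> float:
    """Return how many times more frequent attacks of size `smaller` are
    than attacks of size `larger` under a pure power law."""
    if smaller <= 0 or larger <= 0:
        raise ValueError("attack sizes must be positive")
    # frequency(smaller) / frequency(larger)
    #   = smaller ** -alpha / larger ** -alpha
    #   = (larger / smaller) ** alpha
    return (larger / smaller) ** alpha


if __name__ == "__main__":
    ratio = relative_likelihood(10, 100)
    print(f"10 vs. 100 casualties: about {ratio:.0f} times more likely")
```

Because the ratio depends only on how many times larger one attack is than the other, the same factor of roughly 316 applies to any tenfold jump in casualties, whether from 10 to 100 or from 5 to 50.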
This may seem like a boring set of numbers, but for the first time, it revealed an underlying pattern to war. “It shows that there is something going on in the way these wars are fought that is common to all,” Neil F. Johnson, a physicist at the University of Miami who participated in the research, told Nature.
In the course of the research, the team collected data on 54,679 “violent events” reported in nine different conflicts, including those in Iraq, Afghanistan, Peru, and Colombia.
What’s next for the big data wars?
The Gourley and Bitly projects are first steps in bringing objective quantitative analysis to realms that were once subjective. Big data will play a growing role in maintaining global security as the Department of Defense reshuffles budgets and priorities. According to Forbes, the amount of data from drones and other surveillance technology has risen 1,600 percent since 9/11.
To step up the research, the U.S. military recently made a $250 million bet on big data. In May, U.S. Secretary of Defense Leon Panetta put forward a review of the country’s national defense, spotlighting information processing as a growing priority.
For Gourley, who ultimately left academia to form big data startup Quid, this research will have far-reaching consequences, and not just for the military. If these algorithms work, they may change the very nature of war.