Geopolitics and Big Data – A Tale of Two Worlds
We live in an age of contradictions. Depending on where you live, it’s either a time of unprecedented prosperity or one of crushing poverty. Your country may be enjoying peace and stability, or it may be suffering from high unemployment and civil unrest. The world around us is changing so rapidly that we barely have time to catch our breath, and geopolitical conflicts are reshaping the world as we knew it.
With so much emphasis on the uses of Big Data these days, it should come as no surprise that governments, corporations, and global organizations are turning a hopeful eye toward predictive analytics. The goal is a catalog of human societal-scale behavior and beliefs across all countries of the world: one that connects every person, organization, location, count, theme, news source, and event across the planet into a single massive network. Such a network would capture what’s happening around the world, what its context is, who’s involved, and how the world is feeling about it, every single day.
Big Data and Global Events
The Global Database of Events, Language, and Tone (GDELT) is an open-source online tool with over 250 million worldwide events logged since 1979, providing an easy-to-access platform for monitoring global news at the scale of Big Data. Put simply, it makes world conflict (and other geopolitical events) computable, offering new sets of data that could hold the key to predictive conflict analysis in the future.
GDELT was designed to use text analysis and algorithms to collect information from news media sites in over 100 languages around the world and place it into a single database with daily reports. The database breaks events into 59 different categories and geolocates them. Plugging that data into forecasting algorithms makes it possible to build a predictive model of the global landscape at any given time.
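To make the "categorize and geolocate" idea a little more concrete, here is a minimal Python sketch of what counting coded events by place and type might look like. The Event record, its field names, and the sample rows are all invented for illustration; they are not GDELT's actual schema or data.

```python
from collections import Counter
from typing import NamedTuple


class Event(NamedTuple):
    """A simplified, hypothetical event record (not GDELT's real schema)."""
    date: str      # ISO date the event was reported
    category: str  # coded event type, e.g. "protest"
    country: str   # country where the event was geolocated


def count_by_country_and_category(events):
    """Count events per (country, category) pair."""
    return Counter((e.country, e.category) for e in events)


# Invented sample rows, for illustration only.
SAMPLE_EVENTS = [
    Event("2014-01-01", "protest", "A"),
    Event("2014-01-01", "statement", "A"),
    Event("2014-01-02", "protest", "A"),
    Event("2014-01-02", "protest", "B"),
]

counts = count_by_country_and_category(SAMPLE_EVENTS)
for (country, category), n in counts.most_common():
    print(country, category, n)
```

Once events are reduced to rows like these, questions such as "where are protests being reported this week?" become simple counting problems, which is the sense in which the news becomes something a computer can work with.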
What GDELT essentially does is monitor public information internationally through news media. One of its unforeseen results so far has been to expose how biases emerge in the news. For example, while the downing of Malaysia Airlines Flight MH17 over Ukraine dominated Western media, the Gaza conflict held the attention of the rest of the world.
On close inspection, GDELT is a dataset available on Google BigQuery, a scalable, interactive ad hoc query system for analysis of read-only nested data. There are also plans for a live map where users can intuitively view emergent events in real time as they unfold all over the world. News media may not offer perfect insight into living history, given its tendency toward bias and error, but it does offer a peek into the emotional motivations behind news events, and it geographically pinpoints emerging zones of conflict. The public information that people in every corner of the globe put out every day provides an enormous wealth of indicators that reveal how populations are feeling, what’s important to them, and what the touchpoint issues are. GDELT takes that information and assembles it into a picture of society as a whole.
Analytics is Fancy Guesswork
Completely accurate forecasting will always be a challenge. A great deal of work is being done on very simple algorithms that serve as predictive forecasting models and still yield very useful data and forecasts. But by relying on readily quantifiable data like GDP, infant mortality, or the total number of protests, traditional forecasting ignores more qualitative measurements like emotion.
If you look at things like the thematic and emotional dimensions, you start seeing a huge number of signals that precede a collapse. For example, pro-Russian chatter in Crimea started to appear in GDELT during the protests in Kiev, long before the Russian invasion. However, it’s hard to imagine we’ll ever get to a point where a predictive map says “There’s going to be a riot at 5:15 pm next Friday.” What it can tell us with some degree of accuracy is: “Hey, you might want to take a look at this.”
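One simple way to picture a "you might want to take a look at this" signal is a rolling comparison: flag any day on which mentions of a theme jump far above the recent average. The Python sketch below does that with a trailing z-score. It is an illustrative stand-in and not GDELT's own method; the function name, the window and threshold values, and the daily counts are all assumptions made up for the example.

```python
from statistics import mean, stdev


def flag_unusual_days(daily_counts, window=7, threshold=3.0):
    """Return indices of days whose count is far above the trailing average.

    A day is flagged when it exceeds the mean of the previous `window` days
    by more than `threshold` standard deviations.
    """
    flagged = []
    for i in range(window, len(daily_counts)):
        history = daily_counts[i - window:i]
        mu = mean(history)
        sigma = stdev(history)
        if sigma == 0:
            # Flat history: any increase at all counts as unusual.
            if daily_counts[i] > mu:
                flagged.append(i)
        elif (daily_counts[i] - mu) / sigma > threshold:
            flagged.append(i)
    return flagged


# Invented daily counts of articles on one theme; not real GDELT data.
chatter = [4, 5, 3, 4, 6, 5, 4, 5, 21, 6]
print(flag_unusual_days(chatter))  # prints [8]
```

Note what this does and does not do: it points at the day the chatter spiked, but it says nothing about what will happen next or when. That is about the level of confidence the paragraph above describes.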
Other Analytics Tools
GDELT isn’t the only database pooling information about conflict and news media through a data lens with an eye toward predictive forecasts. The Defense Advanced Research Projects Agency (DARPA) has its Worldwide Integrated Crisis Early Warning System (W-ICEWS), developed by Lockheed Martin, which draws on 17 million news stories with “aggregate forecast accuracy of more than 80 percent.” Having computable and reliable intelligence to inform government decision-makers on where to allocate resources would represent an enormous strategic advantage, especially if these models become more accurate.
But computers aren’t perfect. GDELT makes mistakes when interpreting sarcasm and can fail to notice editorial skew. For example, in the case of the downed Flight MH17 civilian airliner in Ukraine, the finger-pointing from Russian and Ukrainian news sources distorted the story in GDELT. That said, other inputs, such as combining the algorithmic forecasting with more scientific data like water availability or climate issues, could potentially improve the accuracy of forecasts about global events.
Eventually, it’s not far-fetched to imagine a future where the entire world is computationally structured: a single computer model blending earth data and human information to produce predictive data on conflict. In that world, we’ll hopefully see wars coming before they turn into humanitarian disasters.
Whether or not we act on those predictive forecasts is another question. How we use the information we collect is ultimately just as important as how we gather it. No matter what the prediction, or its outcome, in the end we are absolutely and most decidedly in charge of our own destiny.
Dr. Chuck Kulig