IBM has teamed up with university researchers to use big data and analytics to predict the outbreak of deadly diseases such as dengue fever and malaria.
The research is aimed at understanding the spread of diseases in real time to better deploy public health resources, said James Kaufman, public health manager at IBM Research in the IBM Almaden Research Center in San Jose, Calif.
But rather than just predicting the spread of a disease, the researchers at IBM, Johns Hopkins University, and the University of California at San Francisco are applying analytics to large data sets to see how changes in rainfall, temperature, and even soil acidity can dramatically affect the populations of wild animals and insects that carry the diseases. They’re also merging that information with other data, like airport and highway traffic, to further understand outbreaks.
To do this, IBM created an open-source modeling application dubbed the Spatio Temporal Epidemiological Modeler (STEM) that allows any kind of data to be quickly combined and correlated with disease data. The research could be important in understanding dengue fever, which is hitting places in Texas and Florida, thanks to the spread of mosquitoes. The disease was once thought to be limited to the tropics or developing countries, but it’s showing up all over the world. Part of the reason is climate change, along with the rise of global transportation and trade. Dengue fever has spread to more than 100 countries, and malaria is still responsible for a million deaths a year.
Analytics has been used to predict what you’ll buy next on Amazon or whether you’ll pay for an item in games like FarmVille. But it’s also useful in fields such as public health and disease research. Some parallels in the real world: Walmart uses sales data from its stores to predict when flu season starts and when to put more medicine on the shelves. And Google is able to predict the flu based on people's searches for flu symptoms.
“We want to make it easier for people to gather and use data for research with open-source tools,” Kaufman said. “The thing that makes the greatest difference in the ability to make predictions about disease is the quality of your data. But most often there is a limitation in the data you have access to.”
In the past, it often took years to gather data from disease monitoring. But as the U.S. implements electronic health records across the country, it is becoming easier to access data in real time, Kaufman said. That data can now be crunched in the cloud, meaning web-connected data centers, and analyzed quickly. With STEM, scientists used population analytics, algorithms of disease paths, and powerful computing to build realistic and accessible models of these infectious diseases.
“Public health officials can’t afford to act on speculation during an epidemic. They need accurate and timely access to data to see what the potential spread of a disease might be for a given geographic region over a period of time,” said Kaufman.
With malaria, the researchers used the model and data from the World Health Organization. They were able to see how changes in local climate and temperature affected the spread of the disease. Now they can use that data to figure out where the next outbreaks will be. STEM is free and open to any scientist who chooses to build on its foundation in an open way. STEM 2.0 will be available on Oct. 15 through the Eclipse Foundation.
“There are a lot of tacit assumptions out there about how changes in climate will impact the distribution of diseases like malaria. This work suggests that things probably are not so simple. A change that has a huge effect on malaria transmission in one place might not be as important somewhere else,” said Justin Lessler of Johns Hopkins Bloomberg School of Public Health. “One of the nice things about open source projects like STEM is that now whoever wants to can download the model and start tweaking it, seeing if their own data or assumptions fundamentally change the results.”
“It is important to recognize the synergistic effort of theoretical and computational scientists, disease experts, and public health officials making a difference in how rapidly and effectively we fight infectious diseases,” said Simone Bianco of UC San Francisco’s Bioengineering and Therapeutic Sciences. “We have to be ready at the drop of a hat to parse through disparate data from global disease surveillance systems, conduct computationally intense research, and transfer our knowledge to public health officials to help them visualize population health, detect outbreaks, develop new models, and evaluate the effectiveness of policies.”
The research was published in the peer-reviewed journals Malaria and Theoretical Biology.