New data sources fuel understanding of public health emergencies
- By Kathleen Hickey
- Sep 20, 2016
Remember when Google search results were first used to predict the flu?
Now, data from mobile phones, social media and even grocery scanners has been shown to be effective at identifying patterns in epidemics.
“Human mobility contributes significantly to epidemic transmission into new regions,” Frédéric Pivetta, of Real Impact Analytics, wrote in a blog post. Standard travel data collection methods, however, are limited and often provide outdated data.
Mobile phones, on the other hand, are nearly ubiquitous, and can serve as a rich data resource. Call data, which automatically provides time and location details, can help in understanding human mobility. Information on social networks (who communicates with whom) is also valuable, as is data on mobile spending, which can be used as an indicator of socioeconomic status.
“Aggregated and anonymized, mobile telecom data fills the public data gap without ... privacy issues,” Pivetta said. “Mixing it with other public data sources results in a very precise and reliable view on human mobility patterns, which is key for preventing epidemic spreads.”
Using this aggregate data, researchers can identify mobility patterns for each disease, determine epidemic incidence, build an epidemiological model, map epidemic risk flows and flag at-risk areas as well as prioritize and monitor public health measures.
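As a rough illustration of the first step, identifying mobility patterns, the sketch below turns call records into aggregate flows between regions. It is not Real Impact Analytics' method. The record layout (user, timestamp, region), the function name `mobility_flows` and the `min_count` suppression threshold are all assumptions made for this example.

```python
from collections import Counter, defaultdict

def mobility_flows(records, min_count=1):
    """Count region-to-region moves from (user, timestamp, region) call records.

    Hypothetical sketch: pairs seen fewer than min_count times are dropped,
    so that only aggregate flows are reported.
    """
    # Group each user's call events so they can be ordered in time.
    by_user = defaultdict(list)
    for user, timestamp, region in records:
        by_user[user].append((timestamp, region))

    flows = Counter()
    for events in by_user.values():
        events.sort()  # chronological order per user
        # Compare each call location with the next one for the same user.
        for (_, origin), (_, destination) in zip(events, events[1:]):
            if origin != destination:
                flows[(origin, destination)] += 1

    # Only aggregate counts leave this function; user identifiers do not.
    return {pair: n for pair, n in flows.items() if n >= min_count}

# Toy records, invented for illustration only.
sample_records = [
    ("u1", 1, "A"), ("u1", 2, "B"),
    ("u2", 1, "A"), ("u2", 3, "B"),
    ("u3", 1, "B"), ("u3", 2, "B"), ("u3", 5, "C"),
]
sample_flows = mobility_flows(sample_records, min_count=2)
```

With the toy records above, only the A-to-B flow is reported, because it is the only move made by at least two users.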
In a similar initiative this summer, IBM announced a project to help track the spread of the Zika virus, using, among other things, sentiment analysis. The company’s researchers in California planned to train scientists from Brazil’s Oswaldo Cruz Foundation (Fiocruz), an institution affiliated with the Brazilian Ministry of Health, to use the Spatiotemporal Epidemiological Modeler (STEM), an open source disease modeling application that visualizes the spread of infectious diseases. STEM helps public health officials and epidemiologists analyze the effects of factors such as geography, weather, time and human travel patterns. It has been used to study and help predict the spread of infectious diseases such as influenza and Ebola, as well as mosquito-borne diseases such as malaria and dengue fever.
Part of the project involves analyzing public, Portuguese-language Twitter posts that discuss the incidence of Zika, dengue and chikungunya, as well as the appearance of the Aedes aegypti mosquito, the species mainly responsible for transmitting these illnesses.
After Fiocruz defines search parameters, IBM's Research Lab in Brazil will harvest and interpret anonymized data, which will allow Fiocruz to make recommendations directly to public health officials, IBM said. The company said it applied similar technology at the 2014 World Cup in Brazil, analyzing nearly 60 million social media posts.
A new twist in applying data to public health involves using grocery scanner data to speed investigations during an outbreak of a foodborne illness. Researchers demonstrated that analyzing data from retail scanners against maps of confirmed cases of foodborne illness can speed the identification of a contaminated food source.
“Our analysis shows that after receiving as few as 10 laboratory confirmed case reports, it is possible to narrow the investigation to approximately 12 suspect products” within a few hours, according to a report by scientists at IBM’s Almaden Research Center. By contrast, a traditional investigation can take weeks or even months.
“While traditional methods like interviews and surveys are still necessary, analyzing big data from retail grocery scanners can significantly narrow down the list of contaminants in hours for further lab testing,” said lead scientist Kun Hu.
Researchers analyzed such data points as geographic location, shelf life, likelihood of harboring a particular pathogen and possible time of consumption for hundreds of grocery product categories. This information was then mapped to the known locations of illness outbreaks. Finally, researchers ranked all grocery products by likelihood of contamination.
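The ranking step can be pictured with a minimal sketch. It does not reproduce the IBM researchers' model. It assumes a simple log-likelihood score that asks how well each product's regional sales pattern explains where the confirmed cases occurred, and it ignores shelf life, pathogen likelihood and time of consumption. The function name, the `smoothing` parameter and the data are hypothetical.

```python
import math

def rank_products(sales_by_region, cases_by_region, smoothing=1e-6):
    """Rank products by how well their sales geography matches case locations.

    Hypothetical scoring: treat each product's share of sales per region as
    the probability that a case caused by that product appears there, then
    sum the log-probabilities of the observed cases.
    """
    scores = {}
    for product, sales in sales_by_region.items():
        regions = set(sales) | set(cases_by_region)
        # Smoothing keeps the logarithm defined where a product has no sales.
        denominator = sum(sales.values()) + smoothing * len(regions)
        score = 0.0
        for region, case_count in cases_by_region.items():
            share = (sales.get(region, 0.0) + smoothing) / denominator
            score += case_count * math.log(share)
        scores[product] = score
    # Highest score first: the most plausible suspects lead the list.
    return sorted(scores, key=scores.get, reverse=True)

# Toy scanner totals and case counts, invented for illustration only.
example_sales = {
    "spinach": {"north": 90, "south": 10},
    "eggs": {"north": 50, "south": 50},
    "sprouts": {"north": 10, "south": 90},
}
example_cases = {"north": 9, "south": 1}
example_ranking = rank_products(example_sales, example_cases)
```

In this toy data most cases fall in the north, so the product sold mostly in the north ranks first and the one sold mostly in the south ranks last.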
“Our study shows that big data and analytics can profoundly reduce investigation time and human error and have a huge impact on public health,” Hu said.
Kathleen Hickey is a freelance writer for GCN.