IBM harvests supermarket data to spot, predict foodborne illnesses
’Tis the season for socializing and picnicking in backyards and parks across the nation. But that also means it’s high season for foodborne illnesses, when contamination can spread across the nation’s food chain. A number of tools have been introduced in recent years to help public health departments track the path of food poisoning and other foodborne illnesses.
The City of Chicago Department of Public Health, for example, was the first to test the potential of social media in identifying foodborne outbreaks, according to the Journal of the American Medical Association. Together with the Smart Chicago Collaborative, it is developing apps to monitor Twitter for possible food poisoning references. A similar project is underway in New York, where the New York City Department of Health and Mental Hygiene is working with Columbia University and review website Yelp to filter restaurant-goers’ comments for clues to an outbreak.
Elsewhere, the Food and Drug Administration in 2012 introduced iRISK, a Web-based system to analyze data on microbial and chemical hazards in food and estimate their impact on a population. The tool is designed to enable users to conduct “fully quantitative, fully probabilistic risk assessments of food safety hazards relatively rapidly and efficiently,” according to the FDA.
The latest technology for checking the health of the food supply debuted this month when IBM introduced an analytic system it said could help public health departments not only track but predict contaminations in the food supply and accelerate the health care response.
IBM, which recently published its research on the project in the journal PLOS Computational Biology, described the tool as a “breakthrough” technology, capable of identifying contaminated products “within as few as 10 outbreak case reports.”
The IBM system uses algorithms, visualization and statistical techniques to sort through date and location data on “billions” of products in the food supply to help identify “guilty” or contaminated products, according to the firm.
To help accelerate the investigation, IBM is using petabytes of food-based sales data in inventory systems used by food retailers and distributors, some of which manage up to 30,000 food items at any point in time.
IBM’s system “automatically identifies, contextualizes and displays data from multiple sources to help reduce the time to identify the most likely contaminated sources by a factor of days or weeks,” according to the company.
The system also integrates retail data with geocoded public health data to allow investigators to map the distribution of suspect foods. Researchers can look at geographic information on a map and access case and lab reports from clinical encounters in specific locations.
The algorithm also learns from each new report and recalculates the probability of each food that might be causing the illness.
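The article does not spell out IBM’s algorithm, but the general idea of recalculating each food’s probability as reports arrive can be illustrated with a simple Bayesian update. The sketch below is an assumption-laden illustration, not IBM’s method: the product names, regions, sales shares and function names are all hypothetical, and it assumes a case is more likely to appear in a region where the contaminated product sells more.

```python
from math import exp, log

def update_log_posterior(log_post, sales_share, case_region, floor=1e-6):
    """Fold one geocoded case report into the running log-posterior.

    Assumption: the likelihood of a case in a region, given that a product
    is the culprit, is that product's share of sales in the region.
    """
    updated = {}
    for product, lp in log_post.items():
        likelihood = max(sales_share[product].get(case_region, 0.0), floor)
        updated[product] = lp + log(likelihood)
    return updated

def normalize(log_post):
    """Convert log-scores into probabilities that sum to 1."""
    peak = max(log_post.values())
    weights = {p: exp(lp - peak) for p, lp in log_post.items()}
    total = sum(weights.values())
    return {p: w / total for p, w in weights.items()}

# Hypothetical retail data: fraction of each product's sales by region.
sales_share = {
    "sprouts": {"north": 0.7, "south": 0.2, "west": 0.1},
    "spinach": {"north": 0.3, "south": 0.4, "west": 0.3},
    "salsa":   {"north": 0.1, "south": 0.3, "west": 0.6},
}

# Hypothetical case reports, each reduced to the region where it occurred.
case_reports = ["north", "north", "south", "north"]

# Start with every product equally suspect, then update per report.
log_post = {p: log(1.0 / len(sales_share)) for p in sales_share}
for region in case_reports:
    log_post = update_log_posterior(log_post, sales_share, region)

posterior = normalize(log_post)
ranked = sorted(posterior, key=posterior.get, reverse=True)
for product in ranked:
    print(f"{product}: {posterior[product]:.3f}")
```

With these made-up numbers, the product whose sales pattern best matches where the cases cluster rises to the top of the ranking after only a handful of reports, which is the behavior the article attributes to the system.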
“Predictive analytics based on location, content and context are driving our ability to quickly discover hidden patterns and relationships from diverse public health and retail data,” said James Kaufman, manager of public health research for IBM Research.
In announcing the research, IBM pointed out that the speed of an investigation often depends on food industry firms supplying relevant data to be analyzed.
“This can be achieved by combining innovative software technology with already existing data and the willingness to share this information in crisis situations between private and public sector organizations,” said Dr. Bernd Appel, head of the Department Biological Safety at the German Federal Institute, which is working with IBM on the research.
IBM is working with public health organizations and retailers in the United States to scale the research prototype and begin processing information from 1.7 billion supermarket items that are sold each week in the country, according to Kaufman.