Finding food poisoning cases via Yelp reviews

Restaurant diners afflicted with food poisoning tend to share their gastrointestinal distress on social media, rather than reporting the illness to the local health department, which monitors reported food poisoning so it can curtail future outbreaks.


Vasudha Reddy and her colleagues at the New York City Department of Health and Mental Hygiene noticed mentions of food poisoning among Yelp reviews when they were tracking down an outbreak a few years ago. That’s when the city decided to partner with Columbia University to analyze Yelp comments for keywords indicating foodborne illness in the city.

“We realized there might be people who are not aware of reporting through the mechanism of 311,” said Reddy, a foodborne disease epidemiologist, adding that 311 calls or online complaints to the city remain the preferred mechanism for reporting these issues.

To sift through the comments, Columbia receives a daily feed from Yelp that is then analyzed by an algorithm that “breaks down the raw review text into words and phrases of up to three consecutive words,” Reddy and Thomas Effland, a computer science Ph.D. student at Columbia, said in a written explanation of the system. “It then uses the counts of these words and phrases to predict if the review discusses foodborne illness or not.”

The classifier is a well-known machine learning algorithm, but city epidemiologists have done the training and data labeling.

It is able to pick up keywords like “sick,” “vomiting,” “diarrhea,” “food poisoning” and others that could indicate illness linked to the restaurant. Reviews that trip the algorithm are then sent on to the Health Department, where officials follow up with the commenter for verification.
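The word-and-phrase counting step can be illustrated with a short Python sketch. It is not the Health Department's code: the function names, the keyword weights and the bias value are all hypothetical, chosen by hand for illustration, whereas the real system learns its parameters from reviews labeled by epidemiologists.

```python
import re
from collections import Counter


def ngram_counts(text, max_n=3):
    """Count every run of 1..max_n consecutive words in a review."""
    words = re.findall(r"[a-z']+", text.lower())
    counts = Counter()
    for n in range(1, max_n + 1):
        for i in range(len(words) - n + 1):
            counts[" ".join(words[i:i + n])] += 1
    return counts


# Hypothetical, hand-picked weights (an assumption for this sketch).
# A trained classifier would learn values like these from labeled reviews.
ILLUSTRATIVE_WEIGHTS = {
    "food poisoning": 3.0,
    "vomiting": 2.0,
    "diarrhea": 2.0,
    "sick": 1.0,
    "sick of": -1.5,  # phrase that usually does not describe illness
}


def illness_score(text, weights=ILLUSTRATIVE_WEIGHTS, bias=-1.5):
    """Linear score over n-gram counts; the bias is also an assumption."""
    counts = ngram_counts(text)
    return bias + sum(w * counts[gram] for gram, w in weights.items())


def flags_illness(text):
    """Flag a review for human follow-up when its score is positive."""
    return illness_score(text) > 0
```

Counting phrases as well as single words is what lets a sketch like this treat "sick of waiting" differently from "got sick": the two-word phrase carries its own weight, separate from the word "sick" alone.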

“We set aside time every day to look at all the reviews that the computer algorithm spits out,” Reddy said.

The system has identified 8,523 complaints of foodborne illness since 2012. There have been about 28,000 complaints in total, so 311 is still the main source, she said. Since 2012, the algorithm has helped the Health Department identify 10 outbreaks, which are defined as two or more cases of gastrointestinal illness occurring within 30 days of each other and associated with eating at the same restaurant.
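The outbreak definition above translates into a simple check, sketched here in Python. The function name, the input format of (restaurant, date) pairs and the reading of "within 30 days" as a gap of 30 days or fewer are assumptions made for illustration; the article does not describe how the department actually applies the rule.

```python
from collections import defaultdict
from datetime import date, timedelta


def restaurants_with_outbreaks(cases, window_days=30):
    """Return the set of restaurants with two or more cases that fall
    within window_days of each other.

    cases: iterable of (restaurant_name, case_date) pairs (assumed format).
    """
    by_restaurant = defaultdict(list)
    for restaurant, day in cases:
        by_restaurant[restaurant].append(day)

    window = timedelta(days=window_days)
    flagged = set()
    for restaurant, days in by_restaurant.items():
        days.sort()
        # If any two cases are within the window, some pair of
        # neighboring cases in sorted order is too, so checking
        # consecutive pairs is enough.
        if any(later - earlier <= window
               for earlier, later in zip(days, days[1:])):
            flagged.add(restaurant)
    return flagged
```

For example, two cases at the same restaurant on Jan. 1 and Jan. 20 would be flagged, while cases on Jan. 1 and March 15 would not.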

Since its initial release, the process has undergone improvements that have made it more accurate. These include using a better classification algorithm and providing it with more and better data for training.

Luis Gravano and Daniel Hsu, professors of computer science at Columbia Engineering and coauthors of a recent study on the Health Department system, said it has already improved the detection of outbreaks of foodborne illnesses.

"Effective information extraction regarding foodborne illness from social media is of high importance -- online restaurant review sites are popular, and many people are more likely to discuss food poisoning incidents in such sites than on official government channels," said Gravano and Hsu.

About the Author

Matt Leonard is a former reporter for GCN.
