Nowcasting: Disease monitoring at Internet speed

Do you search the Internet to diagnose your aches and pains? Turns out enough people do that scientists at Los Alamos National Laboratory say they can more effectively monitor and forecast diseases by analyzing views of Wikipedia articles.

The findings add to growing evidence that “traditional, biologically-focused monitoring techniques are accurate but costly and slow,” and that new models based on social media data and Internet search – or nowcasting – are emerging to take their place.

In the Los Alamos study, researchers found that people start searching for disease-related information on Wikipedia before they seek medical attention. Using these techniques, the team monitored influenza outbreaks in the United States, Poland, Japan and Thailand, dengue fever in Brazil and Thailand, and tuberculosis in China and Thailand.

The team was able to forecast all but one of these at least 28 days in advance, the lab reported.
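To make the idea concrete, here is a minimal sketch of how page views might be used to forecast case counts 28 days (four weeks) ahead. It is not the Los Alamos team's model, which this article does not describe. It swaps in a plain least-squares line fitted between views in one week and cases four weeks later. The function names and every number below are hypothetical and synthetic, chosen only for illustration.

```python
# Illustrative sketch only: a simple lagged linear regression, not the
# method used in the study. All data below is synthetic (made up).


def fit_line(xs, ys):
    """Ordinary least-squares fit of ys = slope * xs + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def fit_lagged_model(page_views, cases, lag_weeks):
    """Fit cases in week t + lag_weeks against page views in week t."""
    if len(page_views) != len(cases):
        raise ValueError("series must have the same length")
    if lag_weeks < 0 or lag_weeks > len(page_views) - 2:
        raise ValueError("lag leaves fewer than two points to fit")
    xs = page_views[: len(page_views) - lag_weeks]  # views, weeks 0..n-lag-1
    ys = cases[lag_weeks:]                          # cases, lag weeks later
    return fit_line(xs, ys)


def forecast(page_views_now, slope, intercept):
    """Predict the case count lag_weeks after the given page-view count."""
    return slope * page_views_now + intercept


# Synthetic weekly series: cases were constructed to trail views by 4 weeks.
views = [100, 120, 180, 260, 300, 240, 160, 110, 90, 95]
cases = [55, 50, 52, 58, 60, 70, 100, 140, 160, 130]

slope, intercept = fit_lagged_model(views, cases, lag_weeks=4)
# Forecast cases four weeks after the most recent week of page views.
print(forecast(views[-1], slope, intercept))
```

A real system would need to choose which articles to count, pick the lag from the data rather than assume it, and check forecasts against official case reports.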

Using tools such as Wikipedia will allow the government to more quickly react to epidemics, leading to better strategies to combat the spread of diseases on a global basis, according to the study.

“A global disease-forecasting system will improve the way we respond to epidemics,” said research team member Sara Del Valle. “In the same way we check the weather each morning, individuals and public health officials can monitor disease incidence and plan for the future based on today’s forecast.”

The scientific paper, “Global disease monitoring and forecasting with Wikipedia,” also showed how the social media tools could help facilitate the use of epidemiological models across different regions. For example, researchers could create models using data from Japan to track and forecast disease in Thailand, a capability that is particularly important for countries that do not offer reliable disease data.

The paper, written by Del Valle with Nick Generous, Geoffrey Fairchild, Alina Deshpande, and Reid Priedhorsky, was published in the Public Library of Science journal PLoS Computational Biology.

Meanwhile, researchers are studying how monitoring the use of other social media can improve disease surveillance. An October study published by PLoS found that parsing data from Twitter feeds also significantly improves influenza forecasting.

According to the Centers for Disease Control and Prevention, between 5 and 20 percent of people in the United States get the flu every year. Incorporating Twitter data into the forecasting model reduced errors by 17 to 30 percent compared with forecasting that uses only historical data, with nowcasts “that are two to four weeks ahead of baseline models.”

These models are better predictors than models using data from Google Flu Trends, and the data is more accessible, according to the study.

“Further, recent work has demonstrated that Twitter data correlates with infection rates from the Influenza-like Illness Surveillance Network (ILINet) from the CDC at the municipal level, suggesting that web data could improve forecasts for cities as well,” the paper noted.

The use of social media as a disease forecasting tool has been maturing over the last decade. In fact, HealthMap, a freely available website and mobile app (“Outbreaks Near Me”) developed by a team of researchers, epidemiologists and software developers at Boston Children's Hospital in 2006, predicted the spread of the recent Ebola crisis before human researchers could.

The site uses online informal sources for disease outbreak monitoring and real-time surveillance of emerging public health threats and is used by a wide range of organizations including libraries, local health departments, governments and international travelers.

“HealthMap brings together disparate data sources, including online news aggregators, eyewitness reports, expert-curated discussions and validated official reports,” noted the site. “Through an automated process, updating 24/7/365, the system monitors, organizes, integrates, filters, visualizes and disseminates online information about emerging diseases in nine languages.”

About the Author

Kathleen Hickey is a freelance writer for GCN.

