Routine data sharing prepares for public health emergencies


In February 2016, the Wellcome Trust organized a pledge among leading scientific organizations and health agencies encouraging researchers to release data relevant to the Zika outbreak as rapidly and widely as possible. This initiative echoed a September 2015 World Health Organization (WHO) consultation that assessed data sharing during the recent West Africa Ebola outbreak and called on researchers to make data publicly available during public health emergencies. These statements were necessary because the traditional way of communicating research results -- publication in peer-reviewed journals, often months or years after data collection -- is too slow during an emergency.

The acute health threat of outbreaks provides a strong argument for more complete, quick and broad sharing of research data during emergencies. But the Ebola and Zika outbreaks suggest that data sharing cannot be limited to emergencies without compromising emergency preparedness. To prepare for future outbreaks, the scientific community should expand data sharing for all health research.

Open science, Ebola and Zika

In the health sciences, an important milestone for openness was achieved 20 years ago, as genetic sequencing began to generate massive amounts of data and scientists agreed to deposit sequences in public databases almost as they were produced. Encouraged by the discoveries this facilitated, life science leaders called for openness for other types of data, and today, the movement towards open science is evident across the health sciences landscape.

During the Ebola outbreak, researchers unaffiliated with official response efforts transformed surveillance reports into machine-readable formats and shared them in public repositories, and some teams assisting the response rapidly deposited Ebola virus genetic sequences into public databases. These efforts allowed many scientists to contribute analytical insights -- 80 percent of peer-reviewed epidemiological modeling studies published during the outbreak used only open data. Many researchers also shared computer code of their models online.

As the Zika epidemic highlighted major deficiencies in knowledge of the virus and disease, leading scientific journals agreed to make all Zika-related content free to access. As during Ebola, scientists established a public repository for sharing Zika data.

Data-sharing challenges

Despite these successes, the Ebola and Zika responses also highlight challenges for effective data sharing. Besides scholarly or competitive disincentives for data sharing, scientists may not be able to share data effectively because of inadequate technology, standards or human capacity.

One of the reasons researchers could share genetic sequences effectively during the Ebola and Zika outbreaks was their familiarity with public databases designed for such data (e.g., GenBank). Widely accepted central databases do not exist for other types of research data. Some platforms are little more than “data dumpsters” without the metadata, data dictionaries or documentation required for responsible analysis. Any data-sharing arrangement faces the challenge of protecting patient privacy while preserving the usefulness of the data shared, a topic of active methodological research.

Obstacles are even more significant in lower-resource settings. A review of the Ebola response found that affected countries lacked integrated standards for data collection and that data was not “aggregated, analyzed or shared in a timely manner and in some cases not at all.” Sharing data in a useful way requires staff time, technical infrastructure and human capacities that may not be available in low-resource settings. These essential elements of effective data sharing cannot be expected to materialize during a crisis.

Preparing for the next surprise

Open data deserves recognition and support as a key component of emergency preparedness. Initiatives to facilitate discovery of datasets and track their use, to establish common platforms for sharing and integrating research data and to improve data-sharing capacity in resource-limited areas are critical to improving preparedness and response.

Integrating open science approaches into routine research should make data sharing more effective during emergencies, but this evolution is more than just practice for emergencies. The cause and context of the next outbreak are unknowable; research that seems routine now may be critical tomorrow. Establishing openness as the standard will help build the scientific foundation needed to contain the next outbreak.

Recent epidemics were surprises -- Zika and chikungunya sweeping through the Americas and an Ebola epidemic with more than 10,000 deaths -- and we can be sure there are more surprises to come. Opening all research provides the best chance to accelerate discovery and development that will help during the next surprise.

A longer version of this article is available on PLOS.

About the Authors

Jean-Paul Chretien is a military public health physician-epidemiologist with the Defense Health Agency.

Caitlin M. Rivers is a computational epidemiologist with the Army Institute of Public Health.

Michael A. Johansson is a biologist at the Centers for Disease Control and Prevention.

