White House working group releases strategy for digital scientific data
An interagency working group has recommended that the United States develop a strategic policy for preserving scientific information and making it accessible in a world in which data increasingly is born, stored and used in digital formats.
“Our nation’s continuing leadership in science relies increasingly on effective and reliable access to digital scientific data,” John H. Marburger III, director of the president’s Office of Science and Technology Policy (OSTP), said in releasing the report. “Researchers and students who can find and re-use digital data are able to apply them in innovative ways and novel combinations for discovery and understanding.”
The report, titled “Harnessing the Power of Digital Data for Science and Society,” calls for development of interagency and agency-specific policies for the management of data produced by or for government throughout its life cycle. The policies not only would fulfill obligations to make public information truly public, but would benefit the nation’s economy and its research and development communities.
The report was written by representatives of the 22 agencies that make up the Interagency Working Group on Digital Data of the National Science and Technology Council (NSTC). The NSTC is a Cabinet-level council that operates under the White House Office of Science and Technology Policy.
Although the report is being pushed now as an example of the Obama administration’s policy of democratizing data and putting more of an emphasis on science, it was published in January before the inauguration. Newly appointed Federal Chief Information Officer Vivek Kundra has proposed a central online repository for such information at data.gov, which OSTP has said it is working to implement. The report does not mention data.gov as a vehicle for implementing its recommendations.
The emphasis on managing digital data has been made necessary by developments in scientific research, which is becoming increasingly large-scale, distributed and digitized.
“Science and engineering research and education are increasingly digital,” said Arden Bement, director of the National Science Foundation and an NSTC member. “New observation systems are prime examples, expanding the scales for conducting observations from the sub-atomic to the cosmic. A broad framework for promoting continuing access and interoperability for scientific data is key to progress in this digital age.”
Characteristics of the current digital data landscape are described in the report as:
- The starting points for new research increasingly are digital and their products increasingly are born digital.
- Innovations in digital technology are driving the exploding volumes and rising demand for data.
- All sectors of society are stakeholders in digital preservation and access.
- A comprehensive framework for cooperation to manage the preservation of digital data is missing.
The goal of the report’s recommendations is to bring the increasing volumes of digital data together with the people who will use and take advantage of it. The report lays out a vision and a strategy for achieving it. The vision is to “create a comprehensive framework of transparent, evolvable, extensible policies and management and organizational structures that provide reliable, effective access to the full spectrum of public digital scientific data. Such a framework will serve as a driving force for American leadership in science and in a competitive, global information society.”
To pursue this strategy, the report recommends:
- An NSTC subcommittee for digital scientific data preservation, access, and interoperability be created. This would focus on goals requiring broad coordination.
- Appropriate departments and agencies lay the foundations for agency digital scientific data policy and make the policy publicly available. In doing this, agencies should consider all components of a comprehensive policy to address the full management life cycle.
- The agencies promote a data management planning process for projects that generate data that should be preserved.