HHS on a mission to liberate health data
- By Amanda Ziadeh
- Jun 05, 2015
At the recent Health Datapalooza, the Department of Health and Human Services outlined progress in “liberating” health data to reach researchers, policy makers, innovators, various agencies and the public as a whole.
HealthData.gov and HHS.gov
According to Damon Davis, the director for the health data initiative at HHS, HealthData.gov is a catalog of health, social services and research data made available to the public to improve the country’s health.
The relaunch includes user-friendly tools and updates the platform’s underlying technology for more efficient performance, Davis said in a blog post, in an effort to nurture more applications, products and services capable of enhancing health care.
The project started with the migration of the catalog content to the DKAN open data, open source platform, which is the same technology used by Data.gov. New data, such as the latest in medical and scientific knowledge, clinical care provider quality, health service provider directories, community health performance information and government spending, is now available.
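DKAN exposes its catalog programmatically, which is what makes the platform useful to application builders as well as browsers. The sketch below is illustrative only: it assumes a CKAN-style search endpoint (`/api/3/action/package_search`) and a simplified response shape; the actual paths and payloads on HealthData.gov may differ.

```python
import json
from urllib.parse import urlencode

# Hypothetical endpoint: DKAN implements a CKAN-compatible API, but the
# exact path on HealthData.gov is an assumption here.
BASE = "https://healthdata.gov/api/3/action/package_search"

def build_search_url(query, rows=10):
    """Build a catalog keyword-search URL."""
    return BASE + "?" + urlencode({"q": query, "rows": rows})

# Parsing a simplified, illustrative response payload (not a real capture):
sample_response = json.loads("""
{"success": true,
 "result": {"count": 2,
            "results": [{"title": "Hospital Compare"},
                        {"title": "County Health Rankings"}]}}
""")

def dataset_titles(payload):
    """Extract dataset titles from a search response."""
    return [d["title"] for d in payload["result"]["results"]]

print(build_search_url("clinical quality"))
print(dataset_titles(sample_response))
```

A client could page through such results to mirror the catalog locally, which is roughly how third-party apps build on open data portals.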
The design and features of HealthData.gov will continue to improve based on the beta version’s workability and feedback. In time, Davis said, the site will include better search and sort tools, charts and maps, linkage to other relative datasets and, possibly, a system for requesting and discussing data.
Late last year HHS.gov got a refresh, shifting to responsive design and building new, open CSS for the “table” tag to handle the number and variety of tabular presentations. That code snippet is available on the Mobile Code Sharing Catalog so other agencies can use it for their efforts.
Recent improvements include a new mobile-first design, making the site accessible on a multitude of devices and networks. Along with smart search and tools for easier engagement and content sharing, HHS purged 154,000 obsolete files to make search faster and more relevant. The information has been filtered and organized by topic so users can quickly find exactly what they need.
Demand-Driven Open Data framework
HHS is also using the Demand-Driven Open Data framework, or DDOD. Developed through the HHS IDEA Lab, DDOD “gives external organizations a way to tell HHS what data they need and how they want to consume it,” said David Portnoy, an entrepreneur-in-residence at the HHS IDEA Lab.
HHS found that its data owners were releasing datasets that were easy to generate and least risky to release, without much regard to what data consumers could really use. The DDOD framework lets HHS prioritize data releases based on the data’s value, because every request is treated as a use case.
It lets users -- be they researchers, nonprofits or local governments -- request data in a systematic, ongoing and transparent way and ensures there will be data consumers for information that’s released, providing immediate, quantifiable value to both the consumer and HHS.
Users sign up, then enter the specific information they need in the DDOD Github repository and route it to the appropriate data owners. The DDOD team evaluates the use cases on both qualitative (value to public health, for example) and quantitative measures, such as cost avoidance.
DDOD engages the requesters in requirements management, community engagement, voting and validation to facilitate the data request. And the team works with the data owner to implement a solution. Once a use case is implemented, there are scheduled releases and new features.
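The evaluation step — weighing each use case on qualitative measures like public health value and quantitative ones like cost avoidance, alongside community votes — can be sketched as a simple prioritization. The field names and scoring weights below are assumptions for illustration, not HHS’s actual rubric.

```python
from dataclasses import dataclass

# Hypothetical model of a DDOD use case; the scoring formula is illustrative.
@dataclass
class UseCase:
    requester: str
    dataset: str
    public_health_value: int   # qualitative, e.g. a 1-5 panel rating
    cost_avoidance: float      # quantitative, estimated dollars saved
    votes: int = 0             # community support gathered on GitHub

    def score(self):
        # Blend qualitative and quantitative signals; weights are assumed.
        return self.public_health_value * 10 + self.cost_avoidance / 1000 + self.votes

def prioritize(use_cases):
    """Order use cases so data owners see the highest-value requests first."""
    return sorted(use_cases, key=lambda uc: uc.score(), reverse=True)

queue = prioritize([
    UseCase("state agency", "provider directory", 3, 50_000, votes=12),
    UseCase("researcher", "claims sample", 5, 250_000, votes=4),
])
print([uc.dataset for uc in queue])  # ['claims sample', 'provider directory']
```

The point of such a ranking is the one Portnoy makes: releases go where demand is demonstrated, rather than where release is merely easiest.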
DDOD’s knowledge base is on the DDOD wiki, so the public and data owners can contribute. If there are frequently asked questions regarding why certain data is not available, the form in which it is available or where it can be found, the answer is most likely already posted and made public.
DDOD lets HHS better understand what users want and need, Portnoy said: “You have the push from the expert side, and you have the demand from the community out there, and it all comes together in one place in HealthData.gov.”
NIH’s Commons
NIH is also expanding its open data network with the community and better connecting and transitioning data among its 27 institutes and centers.
“Our approach to this is to establish a Commons,” said Phil Bourne, NIH’s associate director for data science. The Commons is simply a virtually shared space made available on different platforms, including public and private clouds, with the agreement that all resources residing in the Commons comply with specific open data rules.
“To be Commons compliant means that [data providers] agree to identify all of the research objects that go into the space by a unique identifier,” Bourne said. NIH is in the prototype and piloting stages of developing an indexing tool to index content and create available catalogs.
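The compliance rule Bourne describes — every research object entering the shared space gets a unique identifier and becomes discoverable through an index — can be sketched minimally. The registry class and its methods below are hypothetical, not NIH’s actual indexing tool.

```python
import uuid

# Minimal sketch of a Commons index: objects are registered under a
# unique identifier and listed in a catalog. API names are assumptions.
class CommonsIndex:
    def __init__(self):
        self._catalog = {}

    def register(self, name, metadata=None):
        """Assign a unique identifier to a research object and index it."""
        object_id = str(uuid.uuid4())
        self._catalog[object_id] = {"name": name, "metadata": metadata or {}}
        return object_id

    def lookup(self, object_id):
        """Resolve an identifier back to the object's catalog entry."""
        return self._catalog.get(object_id)

    def catalog(self):
        """List all indexed objects, as an indexing tool might expose them."""
        return [(oid, entry["name"]) for oid, entry in self._catalog.items()]

index = CommonsIndex()
oid = index.register("RNA-seq study", {"format": "fastq"})
print(index.lookup(oid)["name"])  # RNA-seq study
```

The unique identifier is what makes cross-platform sharing workable: any participating cloud can resolve the same ID to the same object.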
Additionally, NIH is working on data-level metrics, which will also allow it to measure the usefulness of data in ways it could not before. By tracking citations, recommendations and impact statements of data, NIH can explore and test the metrics needed to capture activity surrounding research data. Eventually, requesters will be able to review datasets previous researchers have worked with, see who has commented on them, who else has used those datasets and what they were used for. Further, the process should encourage researchers to share their data.
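Tracking activity around a dataset amounts to tallying events per dataset and surfacing a summary to the next requester. The sketch below is illustrative; the event names and summary shape are assumptions, not NIH’s metrics design.

```python
from collections import defaultdict

# Illustrative data-level metrics: count activity signals the article
# mentions (citations, comments, reuse) per dataset.
class DatasetMetrics:
    def __init__(self):
        self._events = defaultdict(lambda: defaultdict(int))

    def record(self, dataset, event):
        """Log one activity event (e.g. 'citation') against a dataset."""
        self._events[dataset][event] += 1

    def summary(self, dataset):
        """What a requester reviewing a dataset's track record might see."""
        return dict(self._events[dataset])

m = DatasetMetrics()
m.record("claims-sample", "citation")
m.record("claims-sample", "citation")
m.record("claims-sample", "comment")
print(m.summary("claims-sample"))  # {'citation': 2, 'comment': 1}
```

A visible track record like this is what should, as the article notes, nudge researchers toward sharing: well-used data accrues evidence of its value.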
“We’re looking to drive more structure into the data with this Commons environment,” Bourne said. NIH hopes that a well-formed dataset with greater prominence and more metadata will further encourage researchers to use that dataset as a reference, and in turn, provide even more metadata.
Amanda Ziadeh is a former reporter/producer for GCN.