NIH funds new tools to crack genomic big data

As part of its Big Data to Knowledge Initiative, the National Institutes of Health recently awarded several grants to the biomedical research community to develop software tools for data compression, data visualization, data provenance and data wrangling. NIH is dividing a total of $6.5 million among 15 recipient programs in this fiscal year.

Timely access to genomics data is critical to health care research because tracking genomic changes helps identify what predisposes patients to certain diseases and how patients respond to treatments and therapies.

Because genomics data comes from large quantities of DNA sequencing, a volume expected to grow in the near future, it is of “paramount importance,” according to NIH, to find ways to compress data efficiently, accurately and quickly, and to identify techniques for sharing, accessing, visualizing and searching variously formatted genomic data.

The awards fell into four categories:

Data compression, which becomes more important as digital imaging increases and storage and compute capabilities are pushed to the limit.

Data provenance, which tracks the creation, modification and movement of data during analysis. New provenance tools help researchers better understand the methods others used for a particular experiment and compute quality and trustworthiness scores for data.

Data visualization research, which helps researchers derive new insights by visualizing different data types from across multiple studies.

Data wrangling, or the use of automated tools to convert or map data across different forms to make it more accessible to a variety of applications.

Overall, the research program’s delivery of accessible high-performance software suites for managing genomic data will help researchers manipulate, transfer and access massive datasets used by governmental and NIH-sponsored projects.

Better compression software will reduce the cost of data storage and analysis, and more effective sharing tools will improve researchers’ access to complex data. Additionally, by requiring that these tools be open-source, NIH said, these awards open the door to future innovations and improvements based upon the initial developments.

About the Author

Amanda Ziadeh is a former reporter/producer for GCN.

