Science community seeks standard data formats

In the medical and scientific fields, data and file sharing are critical because they accelerate discoveries and breakthroughs. But research can still be frustrated by a lack of data-sharing standards, as is the case in neuroscience.

Unlike images, which have common file formats such as JPEG, neuroscience data has no standard format. To address this situation, the Neurodata without Borders project – a yearlong initiative to create a “unified data format for cellular-based, neurophysiology data based on representative use cases” – hosted a hackathon late last year to brainstorm ideas for a standard file format.

Beyond the Neurodata without Borders project, prominent laboratories are working to develop a global portal where scientists and researchers can share information and data without having to download special software.

“This issue of standardizing data formats and sharing files isn’t unique to neuroscience. Many science areas, including the global climate community, have grappled with this,” said Oliver Ruebel, computational scientist at the Lawrence Berkeley National Laboratory. “Sharing data allows researchers to do larger, more comprehensive studies. This in turn increases confidence in scientific results and ultimately leads to breakthroughs.”

Standards for neuroscience data have become increasingly important with efforts such as the Obama administration’s BRAIN Initiative, which challenged the neuroscience community to discover new ways to address brain diseases and trauma.

This work is expected to generate a deluge of data, according to Berkeley Lab, so before researchers can even begin taking measurements, they must first develop a standard format for labeling and organizing data, sharing files, and scaling up software to handle massive amounts of information.

To come up with those conventions, Ruebel worked closely with Berkeley Lab scientists Peter Denes and Kristofer Bouchard and UCSF neurosurgeon Edward Chang to design BrainFormat, a neuroscience data standardization framework. It uses open source Hierarchical Data Format (HDF) technologies, which have helped a variety of scientific disciplines organize and share their data.

Beyond standardizing data formats, HDF is optimized to run on supercomputers. So by building BrainFormat on this technology, Berkeley Lab said, neuroscientists will be able to use supercomputers to process and analyze their massive datasets.

About the Author

Mark Pomerleau is a former editorial fellow with GCN and Defense Systems.

