Energy project looks to tame supercomputing's tsunami of data
The ongoing rush to develop more powerful supercomputers for a variety of research projects threatens to bury scientists under a tidal wave of data so large that making use of it could be difficult.
So the Energy Department’s Argonne National Laboratory has opened a new institute devoted to developing tools to help researchers efficiently sift usable information out of petabytes of data.
The Scalable Data Management, Analysis and Visualization (SDAV) Institute aims to develop ways to let scientists spend less time sifting through data and more time on science, Robert Ross, a computer scientist at Argonne and the institute’s deputy director, said in a statement.
Energy's labs research everything from aerodynamics to fuel efficiency, with many projects producing petabytes of data, according to the agency.
“The task of handling this data is overwhelming, forcing scientists to spend much of their time developing special-purpose solutions to store, access and manage the information,” Ross said. “The SDAV teams will develop the necessary tools and software so that scientists can use their time more effectively for scientific investigation and discovery.”
The SDAV project is funded by part of a $25 million grant from DOE’s Office of Science intended to find ways of extracting usable information from large data sets.
The facility will focus on three areas:
- Data management to enable the query of scientific data sets.
- Data analysis techniques for both in-process and post-processing analysis.
- Data visualization tools to identify and understand features in multiscale, multiphysics data sets.
As part of DOE’s Big Data Research and Development Initiative, the SDAV Institute will work with a number of national computing facilities to ensure the successful deployment and adoption of SDAV tools and software. Institute partners include the Argonne Leadership Computing Facility, the National Energy Research Scientific Computing Center at the Lawrence Berkeley National Laboratory, and the Oak Ridge Leadership Computing Facility at the Oak Ridge National Laboratory. These facilities are responsible for installing the new technologies developed by the SDAV team, Argonne officials said.
All three supercomputing centers are supported by DOE’s Office of Science. These partner facilities will alert the SDAV team to upcoming system architectures, guiding development so that SDAV tools remain effective as new systems come online, Argonne officials said.
The SDAV team also plans to hold tutorials and workshops to reach a broader community. The workshops will serve to gather information from other researchers and to train potential users, Argonne officials said. These efforts will be coordinated with DOE computing facility activities and major conferences.
The SDAV Institute also combines expertise from three successful DOE Scientific Discovery through Advanced Computing (SciDAC) centers and institutes: the SciDAC Scientific Data Management Center for Enabling Technologies; the Visualization and Analytics Center for Enabling Technologies; and the Institute for Ultra-Scale Visualization.