NSF seeks big data middleware for scientific computing
The National Science Foundation wants to explore ways new commercial big data tools might be brought to bear on the data management and analysis challenges of research-oriented scientific computing.
The NSF awarded a $5 million grant to university teams from Arizona, Emory, Indiana, Kansas, Rutgers, Virginia Tech and Utah to take up the project, dubbed Middleware for Data-Intensive Analytics and Science, or MIDAS.
“Many scientific problems depend on the ability to analyze and compute on large amounts of data,” NSF said in its grant announcement. “This analysis often does not scale well; its effectiveness is hampered by the increasing volume, variety and rate of change of big data.”
The project will integrate features of traditional high-performance computing (HPC), such as scientific libraries, communication and resource management middleware, “with the rich set of capabilities found in the commercial big data ecosystem,” NSF said.
“That includes software systems such as Hadoop,” NSF said, available from the Apache open source community. Hadoop is an open-source framework for the distributed storage and processing of large datasets across clusters of commodity computers.
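Hadoop’s core programming model is MapReduce, which splits a computation into a map phase, a shuffle phase that groups intermediate results by key, and a reduce phase. As a rough, self-contained illustration (not part of the NSF project or the MIDAS middleware), the classic word-count example can be sketched in plain Python; a real Hadoop job would distribute the same three phases across a cluster:

```python
# A minimal local sketch of the MapReduce model that Hadoop popularized:
# word count in plain Python. Hadoop runs these same map/shuffle/reduce
# phases in parallel across cluster nodes.
from collections import defaultdict
from itertools import chain


def mapper(line):
    """Map phase: emit a (word, 1) pair for every word in a line."""
    for word in line.split():
        yield word.lower(), 1


def shuffle(pairs):
    """Shuffle phase: group all emitted values by key."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reducer(key, values):
    """Reduce phase: sum the counts collected for one word."""
    return key, sum(values)


def word_count(lines):
    pairs = chain.from_iterable(mapper(line) for line in lines)
    return dict(reducer(k, v) for k, v in shuffle(pairs).items())


if __name__ == "__main__":
    data = ["big data big analysis", "data at scale"]
    print(word_count(data))
    # {'big': 2, 'data': 2, 'analysis': 1, 'at': 1, 'scale': 1}
```

The appeal for scientific computing is that the same simple map/reduce functions scale from a laptop to thousands of nodes, with the framework handling data partitioning and fault tolerance.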
The NSF project will design MIDAS middleware that “will enable scalable applications with the performance of HPC and the rich functionality of the commodity Apache big data stack,” according to the NSF award announcement.
NSF said the project will address major data challenges in seven different scientific communities, including biomolecular simulations, computational social science, computer vision, pathology informatics and geographical information systems.
Project libraries associated with these research areas will be developed to be scalable and interoperable across a range of computing systems including cloud, clusters and supercomputers, NSF said.