DOE explores cloud computing for big science
Approach could fill a need for midrange computational projects
- By William Jackson
- Oct 23, 2009
The Energy Department is spending $32 million for cloud computing test beds at its Argonne and Lawrence Berkeley national laboratories to explore the demand for — and feasibility of — cloud computing for large-scale scientific experiments and computation.
DOE already hosts some of the world’s most powerful supercomputers used for “big science” projects. The Magellan project will test whether cloud computing can fill a need for midrange computing that requires resources smaller than those of supercomputers but larger than the capacity that individual projects or organizations can easily provide. The project is named for the Magellanic Clouds, which are in turn named for the Portuguese explorer.
“Our sense is that there is a huge demand for computing that is not being met,” said Ian Foster, director of the Computation Institute run by Argonne and the University of Chicago. “Part of our goal is to see what happens when everyone who can get access to it is using it.”
The resources will be used in a variety of disciplines, including biology, climate change and physics.
The Magellan platforms will consist of thousands of Intel Xeon 5500 series processors, also known as Nehalem chips, housed at the Argonne Leadership Computing Facility in Illinois and the National Energy Research Scientific Computing Center (NERSC) at Lawrence Berkeley in California. They also will include storage and servers that create Science Gateways and will be connected by DOE's 100 gigabit/sec Energy Sciences Network. Performance-monitoring software will be used to evaluate how the systems are used.
The project is receiving funding through the economic stimulus law.
“It will be primarily for [DOE's] scientific community, but that is a pretty broad community,” said NERSC Director Kathy Yelick. About two-thirds of the department’s researchers work at universities.
Plans call for equipment for the two test beds to be installed during the next several months, becoming operational early next year. They will operate independently at first, but they could be integrated later in the program. Integration could ease the job of storing data on one site and doing computations on another. It would also make it easier to run different stages of computations on different segments of the cloud. However, experiments are not expected to be run across the cloud on separate segments simultaneously because cloud computing is intended to provide centralized resources.
Cloud computing is a model for providing on-demand access to a shared pool of configurable computing resources, provisioned as needed. Smaller scientific computational problems often are run on departmental clusters with custom software. Cloud computing centralizes resources, which could create efficiencies of scale and enable scientists to scale up to solve larger scientific problems.
Although DOE also hosts supercomputers as part of its Leadership Class Computing Program, Magellan will have a separate infrastructure, running computational problems on a different scale from those typically done on supercomputers. Those midrange jobs typically will require parallel processing on about 100 to 200 processors at the upper end. Jobs on supercomputers typically require thousands of parallel processors, Yelick said.
“You could buy a system like this for a reasonable amount of money,” Yelick said of the smaller clusters. Magellan will help to discover whether the cloud model could be more efficient and cost-effective.
The projects' leaders say the scientific community will find ways to use any new computing capacity made available to it.
A successful scientific cloud computing platform could handle many midrange jobs but still is not likely to free additional time on existing supercomputers, Yelick said. Although supercomputers also run smaller jobs, those jobs often are scheduled between larger projects, and many of them are steps in scaling up projects for supercomputer use.
“I don’t think any of these smaller jobs will be going away,” she said.
William Jackson is a freelance writer and the author of the CyberEye blog.