EPA tries out grid to put idle computers to work
- By William Jackson
- Sep 24, 2004
The Environmental Protection Agency is looking at grid computing to get more mileage out of its supercomputers.
In a proof-of-concept phase that began in January, the agency sped up the execution of a complex air quality modeling program used by states.
The trial 'opens up doors to all kinds of uses,' said Peter Durant, associate director for information management in the agency's Office of Administration and Resources Management. 'EPA has some nice tools, models and databases that can help communities.'
Grid computing goes a step beyond distributed computing. It ties together heterogeneous, geographically dispersed IT resources to share a workload, while presenting users with a single system image.
'It's all about middleware,' said Ken King, vice president for grid computing at IBM Corp., which helped EPA with the project. The middleware decides how to allocate resources across the entire environment.
The assumption behind grid computing is that most computing resources are underused.
IBM has estimated that mainframes are idle 40 percent of the time, servers 90 percent of the time and desktop computers 95 percent of the time.
Earlier distributed-computing programs, such as SETI@home, the search for extraterrestrial intelligence, have harnessed the unused computing cycles on volunteers' PCs. The volunteers download and process files in batches.
But newer software based on the Open Grid Services Infrastructure 1.0 specification can distribute work across a grid as needed, rather than waiting for individual computers to download it.
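The distinction between the two models can be made concrete with a toy sketch. This is an illustration of the general pattern, under invented names and a trivial in-memory queue, not the OGSI software itself: SETI@home-style clients pull batches when they choose, while a grid scheduler pushes work to idle nodes as it arrives.

```python
# Illustrative contrast between the two distribution styles described above:
# "pull" (a volunteer machine downloads a batch to process offline) versus
# "push" (the grid dispatches work to idle nodes as needed).
# Queue contents and worker names are invented for the example.
from collections import deque

work_queue = deque(range(6))  # six pending work units

def pull_batch(queue: deque, batch_size: int) -> list:
    """Pull model: a volunteer fetches a batch and processes it offline."""
    return [queue.popleft() for _ in range(min(batch_size, len(queue)))]

def push_to_idle(queue: deque, idle_workers: list[str]) -> dict:
    """Push model: the scheduler assigns one unit to each idle worker now."""
    return {w: queue.popleft() for w in idle_workers if queue}

batch = pull_batch(work_queue, 3)                          # -> [0, 1, 2]
dispatch = push_to_idle(work_queue, ["node-a", "node-b"])  # -> {'node-a': 3, 'node-b': 4}
```

The push model keeps idle capacity busy continuously, which is why on-demand distribution suits a grid of always-connected supercomputer nodes better than the batch-download model built for intermittently connected volunteer PCs.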
OGSI is the basis of the Open Grid Services Architecture, a set of standards emerging from the Global Grid Forum.
'The standards are maturing,' King said. 'Now it's a matter of getting them implemented in vendors' products. There is still a way to go on that.' He compared grid computing's advance to that of the Linux operating system about five years ago.
Computer Sciences Corp. has been the contractor for EPA's test. It uses the Grid Toolbox, a collection of open-standards grid software from IBM, plus Red Hat Enterprise Linux 2.1 from Red Hat Inc. of Raleigh, N.C., and enterprise information integration software from Avaki Corp. of Burlington, Mass.
The grid, located at EPA's National Environmental Scientific Computing Center in Research Triangle Park, N.C., was put to the test running the agency's Community Multiscale Air Quality modeling system for atmospheric pollution.
'It requires significant amounts of data as well as computer capacity,' Durant said.
The model, developed by EPA and the National Oceanic and Atmospheric Administration, had been fine-tuned by scientists at the Energy Department's Sandia National Laboratories in New Mexico.
'They optimized the model and cut the run time tremendously,' Durant said. 'What the grid offered us was an elegant solution that lets us provide more data and computer capacity to our state partners.'
The grid made the model practical for use by organizations such as the Western Regional Air Partnership, which tries to improve air quality and visibility in national parks and wilderness areas.
The EPA center has two primary computers, a Cray T3E-1200E massively parallel processing system from Cray Inc. of Seattle and an IBM RS/6000 SP high-performance computer. The IBM SP was recently upgraded from three 16-processor nodes to four, with bandwidth on demand through the center's network connections.
A tape archive of more than 60TB of air data resides on a Sun Microsystems Enterprise E4500 server. A High Performance Parallel Interface connects the computers and the file server with data transfer rates up to 800 Mbps.
William Jackson is a freelance writer and the author of the CyberEye blog.