Energy labs test grid for monster downloads
- By Joab Jackson
- Mar 21, 2006
Two Energy Department laboratories, along with a number of universities, tested a grid network that will eventually distribute experimental data from the CERN particle physics laboratory in Geneva to multiple research laboratories around the globe.
This test showed how terabytes of data generated at CERN could be dispersed to multiple laboratories. While electronically shepherding large amounts of information from one location to another is a difficult problem in itself, the task grows even more complex with multiple recipients, said Ian Fisk, associate scientist at Energy's Fermi National Accelerator Laboratory, in Batavia, Ill. This demonstration showed such bulk transfers are possible, using grid software.
For this test, CERN transferred an aggregate of up to one gigabyte per second to 10 research centers, which then redistributed that data to dozens of other recipients.
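To put that rate in perspective, here is a rough back-of-the-envelope calculation in Python. It is only a sketch: the even split across the 10 centers, the sustained rate over a full day, and the decimal units (1 TB = 1,000 GB) are assumptions made for illustration, not figures reported from the test.

```python
# Back-of-the-envelope arithmetic for the test's peak aggregate rate.
# Assumptions (not from the article): an even split across centers,
# a rate sustained for a full day, and decimal units (1 TB = 1000 GB).
AGGREGATE_GB_PER_S = 1.0        # peak aggregate rate cited for the test
NUM_CENTERS = 10                # research centers receiving data from CERN
SECONDS_PER_DAY = 24 * 60 * 60


def per_center_mb_per_s(aggregate_gb_per_s: float, centers: int) -> float:
    """Average per-center rate in MB/s if the load were split evenly."""
    return aggregate_gb_per_s * 1000 / centers


def terabytes_per_day(aggregate_gb_per_s: float) -> float:
    """Volume moved in one day if the aggregate rate were sustained."""
    return aggregate_gb_per_s * SECONDS_PER_DAY / 1000


print(per_center_mb_per_s(AGGREGATE_GB_PER_S, NUM_CENTERS))  # 100.0
print(terabytes_per_day(AGGREGATE_GB_PER_S))                 # 86.4
```

Under those assumptions, each center would see about 100 megabytes per second on average, and the whole system would move roughly 86 terabytes in a day.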
In 2007, CERN will crank up the Large Hadron Collider, which will be the world's largest particle accelerator. The physics community around this project wants to channel the collision results to labs worldwide, which could test out advanced physics hypotheses concerning supersymmetry, string theory and the like.
This approach may tap the potential power of distributed computing. CERN has more computing capacity than any other laboratory involved in the project, yet it holds only 20 percent of the project's total computing capability worldwide. The remaining 80 percent is split across the other participating partners, Fisk said.
In order to get the data out, the group set up a hierarchical model of data distribution. CERN will transmit the data from its LHC experiments to about a dozen 'Tier 1' computing centers and universities, including the Energy Department's Brookhaven National Laboratory and Fermi National Accelerator Laboratory. These centers will act as nodes, redistributing the data to additional 'Tier 2' research centers. They will also archive a second copy of the raw data.
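The hierarchical model described above can be sketched as a small tree in Python. This is an illustrative toy, not the project's software: the `Center` class, the `distribute` function, the "tier 0" label for CERN, the Tier 2 site names and the dataset name are all hypothetical. Only the shape (CERN feeding Tier 1 centers, which archive the raw data and pass it on to Tier 2) comes from the article.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Center:
    """One site in the distribution tree (hypothetical model)."""
    name: str
    tier: int                                   # 0 = CERN (label assumed), 1, 2
    children: List["Center"] = field(default_factory=list)
    received: List[str] = field(default_factory=list)
    archive: List[str] = field(default_factory=list)


def distribute(center: Center, dataset: str) -> None:
    """Deliver a dataset to a center, then fan it out to its children."""
    center.received.append(dataset)
    if center.tier == 1:
        # Tier 1 centers keep a second copy of the raw data.
        # CERN's own copy is not modeled in this sketch.
        center.archive.append(dataset)
    for child in center.children:
        distribute(child, dataset)


# Tier 2 names below are placeholders, not real participants.
tier2_a = Center("tier2-site-a", tier=2)
tier2_b = Center("tier2-site-b", tier=2)
fermilab = Center("Fermilab", tier=1, children=[tier2_a])
brookhaven = Center("Brookhaven", tier=1, children=[tier2_b])
cern = Center("CERN", tier=0, children=[fermilab, brookhaven])

distribute(cern, "run-001")
print(fermilab.archive)    # ['run-001']
print(tier2_a.received)    # ['run-001']
print(tier2_a.archive)     # []
```

In this sketch, CERN sends each dataset only once per Tier 1 center, and the Tier 1 centers carry the cost of reaching the wider set of Tier 2 recipients.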
Grid tools are essential for the job, Fisk said. For instance, Fermilab uses Storage Resource Manager, grid middleware developed in part by Lawrence Berkeley National Laboratory. 'The SRM interface allows us to describe that large group of servers as an interface,' Fisk said. CERN sends the data from multiple servers, and numerous servers at Fermilab receive it. SRM also helps with load balancing, traffic shaping, performance monitoring, authentication and resource usage accountability.
Grid software also presents uniform interfaces for local computing resources, Fisk said. Someone could submit a job processing request using the Condor workload management system. 'The grid interface provides a consistent view of the batch system,' Fisk said.
The Worldwide LHC Computing Grid collaboration assembled the infrastructure for the test, using existing resources such as the U.S. Open Science Grid and Europe's Enabling Grids for E-SciencE. The Energy Department supplies connectivity between Europe and the U.S. through a leased fiber link, operated by the California Institute of Technology.
Joab Jackson is the senior technology editor for Government Computer News.