Energy labs test grid for monster downloads

Two Energy Department laboratories, along with a number of universities, tested a grid network that will eventually distribute experimental data from the CERN particle physics laboratory in Geneva to multiple research laboratories around the globe.

This test showed how terabytes of data generated at CERN could be dispersed to multiple laboratories. While electronically shepherding large amounts of information from one location to another is a difficult problem in itself, the task grows even more complex with multiple recipients, said Ian Fisk, associate scientist at Energy's Fermi National Accelerator Laboratory, in Batavia, Ill. The demonstration showed that such bulk transfers are possible with grid software.

For this test, CERN transferred an aggregate of up to one gigabyte per second to 10 research centers, which then redistributed that data to dozens of other recipients.
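To put that rate in perspective, the short calculation below converts a sustained one gigabyte per second into a daily volume. It is only a back-of-the-envelope sketch: it assumes decimal units, treats the peak rate as if it were sustained for a full day, and splits the total evenly across the 10 centers, none of which the test results state.

```python
# Back-of-the-envelope sketch; all assumptions are labeled below.
# Assumption: decimal units (1 GB = 10**9 bytes, 1 TB = 10**12 bytes).
GB = 10**9
TB = 10**12
SECONDS_PER_DAY = 24 * 60 * 60


def terabytes_per_day(rate_gb_per_s: float) -> float:
    """Volume moved in 24 hours at a sustained rate, in terabytes."""
    return rate_gb_per_s * GB * SECONDS_PER_DAY / TB


aggregate_rate = 1.0  # GB/s, the peak aggregate figure reported for the test
tier1_centers = 10    # number of receiving research centers

daily_total = terabytes_per_day(aggregate_rate)
# Assumption: an even split across centers, used here only for illustration.
daily_per_center = daily_total / tier1_centers

print(f"Total per day: {daily_total:.1f} TB")
print(f"Per center per day (even split): {daily_per_center:.2f} TB")
```

At that pace the network would move on the order of tens of terabytes a day, which is why the test is described in terms of terabytes.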

In 2007, CERN will crank up the Large Hadron Collider, which will be the world's largest particle accelerator. The physics community around this project wants to channel the collision results to labs worldwide, where researchers could test advanced physics hypotheses concerning supersymmetry, string theory and the like.

This approach may tap the potential power of distributed computing. CERN has more computing capacity than any other laboratory in the project, yet it accounts for only 20 percent of the project's total computing capability worldwide. The remaining 80 percent is split among the other participating partners, Fisk said.

In order to get the data out, the group set up a hierarchical model of data distribution. CERN will transmit the data from its LHC experiments to about a dozen 'Tier 1' computing centers and universities, including the Energy Department's Brookhaven National Laboratory and Fermi National Accelerator Laboratory. These centers will act as nodes, redistributing the data to additional 'Tier 2' research centers. They will also archive a second copy of the raw data.
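The tiered model described above can be sketched as a small tree in which each site passes what it receives to the sites below it. This is an illustrative toy, not the software used in the test: the `Site` class, the dataset label and the Tier 2 site names are hypothetical, and treating CERN as holder of the first archived copy is an inference from the phrase "second copy."

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Site:
    """A hypothetical node in the tiered distribution tree."""
    name: str
    tier: int
    archives_raw: bool = False
    children: list[Site] = field(default_factory=list)
    received: list[str] = field(default_factory=list)
    archive: list[str] = field(default_factory=list)

    def receive(self, dataset: str) -> None:
        """Record the dataset, archive it if configured, then pass it down."""
        self.received.append(dataset)
        if self.archives_raw:
            self.archive.append(dataset)
        for child in self.children:
            child.receive(dataset)


# Tier 2 names are placeholders; only the Tier 0 and Tier 1 names come from the article.
tier2_a = Site("tier2-site-a", tier=2)
tier2_b = Site("tier2-site-b", tier=2)
tier2_c = Site("tier2-site-c", tier=2)

fermilab = Site("Fermilab", tier=1, archives_raw=True, children=[tier2_a, tier2_b])
brookhaven = Site("Brookhaven", tier=1, archives_raw=True, children=[tier2_c])

# Assumption: CERN keeps the first copy of the raw data.
cern = Site("CERN", tier=0, archives_raw=True, children=[fermilab, brookhaven])

cern.receive("run-0001")  # hypothetical dataset label

for site in (cern, fermilab, brookhaven, tier2_a, tier2_b, tier2_c):
    print(site.name, "received:", site.received, "archived:", site.archive)
```

The point of the structure is that CERN sends each dataset only to its direct Tier 1 partners, and the fan-out to Tier 2 happens one level down.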

Grid tools are essential for the job, Fisk said. For instance, Fermilab uses Storage Resource Manager, grid middleware developed in part by Lawrence Berkeley National Laboratory. 'The SRM interface allows us to describe that large group of servers as an interface,' Fisk said. CERN sends the data from multiple servers, and numerous servers at Fermilab receive it. SRM also helps with load balancing, traffic shaping, performance monitoring, authentication and resource usage accountability.

Grid software also presents uniform interfaces for local computing resources, Fisk said. Someone could submit a job processing request using the Condor workload management system. 'The grid interface provides a consistent view of the batch system,' Fisk said.

The Worldwide LHC Computing Grid collaboration assembled the infrastructure for the test, using existing resources such as the U.S. Open Science Grid and Europe's Enabling Grids for E-SciencE. The Energy Department supplies connectivity between Europe and the U.S. through a leased fiber link, operated by the California Institute of Technology.

About the Author

Joab Jackson is the senior technology editor for Government Computer News.

