Power processing for all
Clusters open high-performance systems to new users, if you set them up right
UNCLE SAM, as a personification of the United States, first started making the rounds during the War of 1812. Uncle Sim is a somewhat more recent incarnation.
Uncle Sim is the nickname given to researchers running simulations at locations such as the Aeronautical Systems Center Major Shared Resource Center (ASC MSRC) at Wright-Patterson Air Force Base, Ohio. The three supercomputers at the site have explored how the weapons systems on Virginia-class submarines respond to deep-sea pressures, how to protect Humvees from improvised explosive devices, and what conditions send an F-22 fighter jet into a flat spin.
'We can do that much better with simulation than we are able to do with test-and-evaluation or wind-tunnel testing,' said Jeff Graham, ASC MSRC's technical director.
As part of the Defense Department's High Performance Computing Modernization Program (HPCMP), ASC receives a new supercomputer every two years and retires the oldest system. In October, it added a 9,216-processor-core SGI Altix 4700 system running Red Hat Linux with 20 terabytes of shared memory and 440 terabytes of usable disk space. Delivering more than 60 teraflops, it raises the center's capacity to 85 teraflops, more than triple what was available previously.
'We've been doing these technology insertions for seven years,' Graham said. 'A lot of the lessons learned have been integrated into the [request for proposals] that goes out every year.'

Clusters and code
High-performance computing, or HPC, is one area in which government has a clear lead over the private sector. The 280-teraflop IBM BlueGene/L cluster at Lawrence Livermore National Laboratory tops the list of the world's 500 fastest supercomputers (www.top500.org), and five others at federal sites are in the top 10.
In 2008, the National Center for Computational Sciences plans to introduce a petaflop system. The fastest systems make headlines when they set new records, but the real news is what is happening at the low end. Price cuts are making HPC available to organizations without millions to spend.
'Entry-level systems have been one of the major drivers in the market this decade,' said Christopher Willard, senior research consultant at Tabor Communications. 'High tech is invading every segment of the market.'
From a technology standpoint, he said, the major change has been that Linux clusters now dominate the HPC field, with 373 clusters making the June 2007 Top 500 list compared with only one a decade earlier.
'When I started doing clustered computing, it was a strongly held opinion in the HPC space that the commodity clusters would never be successful, let alone the dominant player,' said Donald Becker, who in 1994 built the world's first Linux cluster, the Beowulf Cluster, out of commodity computers at NASA's Goddard Space Flight Center. Becker is now chief technology officer at cluster manufacturer Penguin Computing.
Becker said that before the Beowulf Cluster, designers focused on building hardware that would let machines work together. Now the emphasis is on designing software that makes the hardware operate as a single, high-performance machine. Because they deliver a much better price-to-performance ratio than purpose-built HPC systems, Linux clusters have extended HPC down-market even as they rapidly increase capabilities at the top end. But although such clusters deliver an excellent return on the hardware price, they are less power-efficient than hardware designed for HPC applications, which can tilt the cost equation.
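Becker's point, that software now does the work of making separate machines behave as one, can be sketched in miniature. Real Beowulf-style clusters use MPI message passing over a network; in this hypothetical illustration, Python's multiprocessing pipes stand in for the interconnect while a head process scatters work to worker "nodes" and gathers their partial results.

```python
# Sketch of the Beowulf idea: coordination software, not special hardware,
# makes independent processors act as a single machine. Real clusters pass
# messages with MPI over a network; here multiprocessing pipes stand in
# for the interconnect.
from multiprocessing import Pipe, Process


def worker(conn):
    # Each "node" receives its slice of the problem, computes its
    # partial result, and sends it back to the head node.
    chunk = conn.recv()
    conn.send(sum(x * x for x in chunk))
    conn.close()


def cluster_sum_of_squares(data, nodes=4):
    # Head node: scatter the data, let workers compute, gather the results.
    chunks = [data[i::nodes] for i in range(nodes)]
    links, procs = [], []
    for chunk in chunks:
        parent, child = Pipe()
        proc = Process(target=worker, args=(child,))
        proc.start()
        parent.send(chunk)
        links.append(parent)
        procs.append(proc)
    total = sum(link.recv() for link in links)
    for proc in procs:
        proc.join()
    return total


if __name__ == "__main__":
    # The cluster of workers produces the same answer as a single machine.
    print(cluster_sum_of_squares(list(range(1000))))
```

The hardware here is ordinary; the scatter/compute/gather pattern in software is what presents the workers as one logical machine, which is the shift Becker describes.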
'Computing has become very inexpensive, while our power costs have only gone up,' Becker said. 'If you have an application that fits well on a machine that uses substantially less power, that is a motivation to look at proprietary hardware.'
Gartner Group analyst Phil Dawson said software availability also is important.
'As hardware prices drop, it is cheaper to spend one dollar on hardware and run suboptimal code than spend 10 dollars on programming,' he said.
'That's where you have the cluster x86 platform vs. proprietary hardware,' he said.
Willard also said various types of accelerators (specialized, faster processors) can be added to the cluster mix to speed calculations that are a costly part of applications. Accelerators are typically attached to a regular processor and do not run an operating system.
'Accelerators are at the entry level on the market,' he said. 'It should be very exciting over the next few years to see how they work out and which technologies win.'
Another recent development is the entry of multicore processors into HPC systems. Although these generally work well for standard business applications, Willard said it isn't clear whether they are a good match for technical computing.
'You have to parallelize your code and make the applications more complex. To a degree, it just shifts the bottleneck from one part of the computer architecture to another.'
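Willard's caveat can be seen even in a toy case. A minimal sketch, with hypothetical function names: the serial dot product is one line of logic, but the multicore version must decompose the data, farm chunks out to worker processes, and merge the partial results, and that extra coordination is exactly where the bottleneck can move.

```python
# The serial version is trivial; the parallel version adds decomposition
# and recombination steps, shifting complexity (and potential bottlenecks)
# from arithmetic to data movement and coordination.
from multiprocessing import Pool


def dot_serial(a, b):
    return sum(x * y for x, y in zip(a, b))


def _partial_dot(args):
    # Work done by one core on its slice of both vectors.
    a_chunk, b_chunk = args
    return sum(x * y for x, y in zip(a_chunk, b_chunk))


def dot_parallel(a, b, cores=4):
    # Decompose: each core gets a contiguous slice of both vectors.
    step = (len(a) + cores - 1) // cores
    pieces = [(a[i:i + step], b[i:i + step]) for i in range(0, len(a), step)]
    with Pool(cores) as pool:
        # Combine: sum the partial dot products from every core.
        return sum(pool.map(_partial_dot, pieces))


if __name__ == "__main__":
    a = list(range(10000))
    b = list(range(10000))
    assert dot_parallel(a, b) == dot_serial(a, b)
```

For a problem this small, the cost of splitting, shipping and merging the data can easily exceed the arithmetic saved, which is the shifted bottleneck Willard describes.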
Cray Henry, HPCMP director, said he believes multicore processors will work, offering lower power consumption and heat generation, although they will require software rewrites.
'Over the next decade we will be building supercomputers with at least hundreds, and perhaps thousands, of cores in each socket and tens of thousands of sockets in each machine,' Henry said. 'This dramatic increase in the hardware's ability to support software application concurrency will require us to completely retool our mainstream software applications and invest in the development of new fundamental approaches to solving complex equations with these computers.'

Custom-built computers
Henry oversees eight supercomputer centers serving 5,000 scientists and engineers.
HPCMP has an annual technology refresh program, which it calls technology insertion, so he is familiar with what it takes to create an RFP for a high-performance system. At the low end, these systems can be routine, but large systems are custom-built.
'When the acquisition targets a large supercomputer, anything larger than 4,000 sockets, the agency needs to be aware that the vendor will likely be building a system that has never been built before,' he said. 'Plan for delays, include flexibility where possible in your schedule and understand how you can best structure your acquisition to motivate the most important behavior you would like to see from your vendor.'
The first step is determining what type of architecture best fits the needs of the organization.
Oak Ridge National Laboratory, for example, has several supercomputers that run large, parallel applications, such as those involved in global climate modeling and the design of fusion reactors. They include a 119-teraflop Cray XT4 with 23,000 processors, a Cray X1E with 1,024 vector processors and an IBM BlueGene/P system with more than 8,000 processors.
'We have different systems for different applications,' said Arthur Bland, project director at Oak Ridge's National Center for Computational Sciences. The Cray XT4, he said, 'is an exceptionally well-balanced system that provides the best overall performance on our mix of applications,' and the Cray X1E and IBM BlueGene/P have other classes of applications they are best suited for.
In selecting a system, raw server benchmarks don't tell the whole story. The process requires looking at the entire system, including interconnections, memory, storage and software stack.
'Agencies need to ensure they understand their requirements, the service levels their mission dictates and the applications that will run on the systems,' Henry said. 'These all need to be articulated clearly for vendors in the RFP.'
For its most recent acquisition, ASC MSRC opted for a shared-memory system because other HPCMP sites already had distributed-memory machines, and certain applications run best with shared memory. The RFP process evaluated systems on usability, performance and price/performance. Graham and his staff weren't interested in buying a leading-edge system but one they knew would be stable.
'We stay half a pace behind the bleeding edge,' Graham said. 'We bring in systems that will work from Day 1.'
SGI won the contract, but the system then had to go through two 30-day testing periods to verify its performance and reliability before final acceptance. It passed the tests and went live on schedule Oct. 1.