The making of a supercomputing champ

The Oak Ridge National Laboratory in Tennessee is upgrading its Jaguar supercomputer, moving to the latest model of Opteron processors and adding graphical processing units. The enhancement could produce a tenfold increase in processing power and give it a shot at the world title for speed when completed.

Jaguar, a Cray XT5 system, already was the fastest supercomputer in the United States, with a performance of 1.75 petaflops (or 1,750 trillion calculations per second) in the most recent Top500 rankings released in November. It ranked behind Japan’s champion K Computer, which was the first to break the 10 petaflops barrier.

Although Oak Ridge clocked the Jaguar at 2.33 petaflops, it still would come in third behind the Chinese Tianhe-1A, which was in second place in the latest rankings, at 2.57 petaflops.



The first phase of the Jaguar upgrade was completed in February, when two six-core AMD Opteron processors in each of Jaguar’s 18,688 nodes were replaced with a single 16-core 6200 series Opteron processor. About one-third of the nodes also have been equipped with an NVIDIA graphical processing unit. That boosted its speed to 3.3 petaflops.
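
As a rough check on the figures above, the node and core counts can be worked through in a few lines of Python. This is an illustrative sketch only: the variable names are made up for the example, and it counts CPU cores alone, ignoring the GPUs.

```python
# Figures taken from the article; variable names are illustrative.
NODES = 18_688

# Before the upgrade: two six-core Opterons per node.
cores_before = NODES * 2 * 6

# After phase one: a single 16-core Opteron per node.
cores_after = NODES * 16

# One petaflop is 10**15 floating-point operations per second,
# so the conversion to "trillions per second" divides by 10**12.
PETA = 10**15
TRILLION = 10**12
trillions_per_second = 3.3 * PETA / TRILLION

print(f"CPU cores before: {cores_before:,}")
print(f"CPU cores after:  {cores_after:,}")
print(f"3.3 petaflops = {trillions_per_second:,.0f} trillion calculations per second")
```

The CPU core count rises by about a third, which suggests that most of the jump toward 10 to 20 petaflops would have to come from the GPUs rather than from the processors alone.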

When the current upgrade is completed this fall, the new system will be renamed Titan and its speed could reach 10 to 20 petaflops, depending on the funding available for the project, said Richard Graham, the applications performance tools group leader in the laboratory's Computer Science and Mathematics Division.

The current project is the most extensive in a continuing series of improvements to Jaguar, he said.

“It’s been upgraded five or six times over the last few years,” Graham said. “This is a step beyond,” he added, because the enormous computing power of the GPUs creates a whole new architecture for Jaguar/Titan. “That’s where most of the performance is going to be obtained.”

Getting the new hardware in place is only the first step in creating a world-class supercomputer, however. “We have to figure out how to restructure the algorithms so they can take advantage of the new architecture,” Graham said.

Jaguar is located at the Oak Ridge Leadership Computing Facility, where it occupies 284 cabinets and is billed as the world’s fastest supercomputer for unclassified research. Its nodes run Cray’s version of the SuSE Linux OS, which has been trimmed of unnecessary services. It is used by a variety of scientists from the Energy Department’s national laboratories and academic institutions for complex simulations in materials, climate, chemistry and physics. It is a pioneer in petascale computing, and in 2009, 28 research teams from around the world used it to produce high-resolution climate models, calculations of uranium moving into the Columbia River from aging underground storage facilities, and studies of the production of bioenergy.

The processing power of modern supercomputers allows scientists not only to solve problems and run simulations more quickly but to do things of an order of complexity that would not otherwise be possible. GPUs are helping to boost that power.

Graphical processing units have been around for quite a while, but only relatively recently has their processing power been applied to a wide range of applications. They have quickly caught on, however. According to the Top500, Japan’s K was the only top-tier system not using graphics processors or other accelerators. It achieved its top ranking through sheer size, using 705,024 SPARC64 processing cores.

GPUs are used as attached processors. The large amounts of data used in simulations have to be moved onto them and the results taken off them efficiently. That requires new algorithms, and developing them is not a trivial task, Graham said. It can take years to bring applications up to speed to take advantage of the power of a radically new architecture such as the one envisioned for Titan.
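
The trade-off Graham describes can be illustrated with a toy cost model: offloading work to an attached processor only pays off when the compute time saved outweighs the time spent moving data on and off the device. The sketch below is a hypothetical model, not a description of Titan or of any Oak Ridge tool. The function name and the bandwidth, speedup and data-size figures are all assumptions chosen for illustration.

```python
def offload_time(cpu_seconds, gpu_speedup, data_gb, link_gb_per_s):
    """Estimated time when a task is offloaded to an attached GPU.

    Hypothetical model: input data is copied to the device, the kernel runs
    gpu_speedup times faster than on the CPU, and results of the same size
    are copied back over the same link.
    """
    transfer = 2 * data_gb / link_gb_per_s   # copy in plus copy out
    compute = cpu_seconds / gpu_speedup
    return transfer + compute


# Assumed numbers, for illustration only.
cpu_seconds = 10.0     # time for the task on the CPU alone
gpu_speedup = 10.0     # GPU kernel assumed to be 10x faster than the CPU
link_gb_per_s = 5.0    # assumed host-to-device bandwidth

for data_gb in (1.0, 10.0, 30.0):
    t = offload_time(cpu_seconds, gpu_speedup, data_gb, link_gb_per_s)
    verdict = "worth offloading" if t < cpu_seconds else "not worth offloading"
    print(f"{data_gb:5.1f} GB moved: {t:5.2f} s vs {cpu_seconds:.2f} s on CPU -> {verdict}")
```

In this model the same kernel goes from a clear win to a loss purely because more data has to cross the link, which is one way to see why algorithms have to be restructured to limit data movement.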

Graham's goal is to provide tools to let scientists port their applications to the new system quickly rather than spending several years getting one or two applications started. “That’s the only way this architecture is going to be successful,” he said.

The new Titan probably will be used primarily to enable processes that could not have been done on Jaguar, rather than speed up applications already in use, Graham said. It probably would not be cost-effective to merely speed up existing tools.

With the fastest supercomputer today clocking in at 10.51 petaflops, Titan could be ready to match or exceed it when fully upgraded. But that does not necessarily mean that it will be in first place.

The Top500 list is compiled twice a year by Hans Meuer of the University of Mannheim, Germany; Erich Strohmaier and Horst Simon of the National Energy Research Scientific Computing Center/Lawrence Berkeley National Laboratory; and Jack Dongarra of the University of Tennessee, Knoxville. There will be at least one new list published before the Titan upgrade is done, and processing speeds have been climbing steadily. But despite increases in speeds, the rankings of the top 10 supercomputers remained unchanged in November from the previous rankings released in June.

Where Titan will rank in the list is anybody’s guess, Graham said. “You never know what the Japanese and Europeans will come up with.”


