Reconfigurable chips could accelerate HPC
New systems from Cray, SGI offer performance boost but require expertise
At first glance, the new Cray XD1 that showed up on the loading dock of the Naval Research Laboratory last summer looked like a regular supercomputer. But buried inside the 2.5-TFLOP machine was something genuinely new: microchips with seemingly magical powers to improve performance.
The magical chips are called field-programmable gate arrays (FPGAs) and, at least for certain mathematically intensive processes, promise to boost performance fortyfold or more. What's more, the chips pave the way for reconfigurable computing, a process that allows the owner to modify the chip to meet the specific needs of an application. Although FPGAs require an investment in program modification, vendors say, the payoff could be enormous.
Two of the top supercomputing companies recently introduced FPGA-based components that users can add on to their supercomputers. In September, Silicon Graphics Inc. introduced its Reconfigurable Application-Specific Computing (RASC) server, an add-on unit that an administrator can plug into an SGI system to boost performance. The company is following the lead of Cray Inc. of Seattle, which made FPGAs an option for the mid-range XD1 supercomputers it introduced a year ago.
According to observers, the entrance of these supercomputer heavyweights might finally bring the powers of FPGA computing to a wider audience. FPGAs have long been embedded in electronic appliances, such as automobile subsystems that give dashboards their smarts. But for general computing, FPGAs have only been offered on plug-in cards, or through dedicated systems offered by smaller companies such as SRC Computers Inc. of Colorado Springs, Colo.
The government has been an early adopter of FPGAs in supercomputing. SGI's product was actually spawned by government work, said Bill Mannel, director of marketing for SGI's server and platform group. A defense agency had asked the company to outfit an SGI Altix system with customized FPGA chips. One of the chips could do the work of 144 regular processors.
'The interest in the technology grew, so that about a year and a half ago, we decided to put our own research dollars into creating a product,' Mannel said.
The current commercial release of SGI's RASC, which costs about $40,000, runs a single Virtex-II 6000 FPGA chip from Xilinx Inc. of San Jose, Calif. The company claims that early beta testers have executed a task 79 times faster with this brick than they could have using an Intel Itanium chip.
'So what we are saying is that you can go 79 times faster or remove 79 processors,' said Ron Renwick, SGI's product manager for reconfigurable computing.

The Navy gets more super
When the Defense Department's High Performance Computing Modernization Program ordered an XD1 from Cray last summer, it also specified that FPGAs be added. The machine consists of 24 blades. Each blade contains 12 AMD Opteron dual-core processors and, embedded on a daughter card, six Xilinx Virtex-II Pro FPGAs.
The Naval Research Laboratory will use the XD1 supercomputer mostly for classified work. Which brings us to this: FPGA computing isn't for everyone just yet. Both Cray and SGI are looking at specific markets for their FPGA-based products.
'This is for customers whose applications spend a majority of time working on specific algorithms. We need to have a clearly defined algorithm that they can extract and port to the FPGA,' Renwick said. 'The FPGA needs a clearly defined task to work on.'
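Renwick's point about applications that spend most of their time in one algorithm can be put in numbers with Amdahl's law, which the article does not name but which is the standard way to estimate such gains. The sketch below is a rough illustration only: the 79x kernel speedup is the figure SGI cites for its beta testers, while the function name and the runtime fractions are hypothetical.

```python
def overall_speedup(offload_fraction: float, kernel_speedup: float) -> float:
    """Amdahl's law: whole-application speedup when only part of the
    runtime is accelerated.

    offload_fraction: share of the original runtime spent in the algorithm
                      moved to the FPGA (0.0 to 1.0). Hypothetical input.
    kernel_speedup:   how much faster that algorithm runs on the FPGA.
    """
    if not 0.0 <= offload_fraction <= 1.0:
        raise ValueError("offload_fraction must be between 0 and 1")
    if kernel_speedup <= 0:
        raise ValueError("kernel_speedup must be positive")
    untouched = 1.0 - offload_fraction               # still runs on the CPU
    accelerated = offload_fraction / kernel_speedup  # runs on the FPGA
    return 1.0 / (untouched + accelerated)


# 79x is the vendor-quoted kernel speedup; the fractions are made up.
for fraction in (0.50, 0.90, 0.99):
    gain = overall_speedup(fraction, 79.0)
    print(f"{fraction:.0%} of runtime offloaded -> {gain:.1f}x overall")
```

Under these assumptions, an application that spends half its time in the offloaded algorithm gains only about 2x overall, one that spends 90 percent gains about 9x, and one that spends 99 percent gains about 44x. That is why the vendors are aiming at codes dominated by a single, clearly defined kernel.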
FPGAs do their best work when they do not have to handle branching executions or make a lot of calls to memory. A well-defined algorithm simply takes one set of data, performs a series of actions on it (such as comparing it to another string) and submits its output to the central processor.
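The kind of kernel described above can be modeled in a few lines of software. The sketch below is not FPGA code, and the function names and sample data are hypothetical. It only shows the shape of the work: data streams in, every window gets the same fixed comparison against a pattern, and a result streams out, with no decisions that change the flow of the computation.

```python
from typing import List


def count_mismatches(window: bytes, pattern: bytes) -> int:
    """Compare two equal-length byte strings position by position.

    Every position gets the same treatment, so the work per window is
    fixed: the sort of regular operation that maps well onto hardware.
    """
    return sum(a != b for a, b in zip(window, pattern))


def scan_stream(data: bytes, pattern: bytes) -> List[int]:
    """Slide the pattern across the data and report one score per window."""
    if not pattern:
        raise ValueError("pattern must not be empty")
    width = len(pattern)
    return [
        count_mismatches(data[start:start + width], pattern)
        for start in range(len(data) - width + 1)
    ]


# Hypothetical input: a score of 0 marks an exact match at that offset.
scores = scan_stream(b"GATTACAGATTTCA", b"GATTACA")
print(scores)
```

In software the windows are scored one after another. The appeal of an FPGA is that many such comparisons can be laid out side by side in hardware and run at once.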
In government at least, a number of supercomputing applications perform such operations en masse: encrypting and decrypting numerical keys; signal processing; or picking out the edges of a face, building or some other object in an image. Each of these operations could benefit from FPGA computing.
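Edge finding is a good example of why these workloads suit the hardware. The sketch below is a deliberately simplified software model, not any vendor's implementation: the function name, the tiny test image and the threshold are all hypothetical, and it checks only left-to-right brightness changes. The point is that every pixel gets the same small calculation, using only its immediate neighbors.

```python
from typing import List


def horizontal_edges(image: List[List[int]], threshold: int) -> List[List[int]]:
    """Mark pixels where brightness jumps between left and right neighbors.

    image:     rows of grayscale values (hypothetical sample data).
    threshold: minimum brightness difference that counts as an edge.
    Border pixels are left at 0 because they lack a neighbor on one side.
    """
    edges = []
    for row in image:
        marked = [0] * len(row)
        for x in range(1, len(row) - 1):
            gradient = abs(row[x + 1] - row[x - 1])
            marked[x] = 1 if gradient >= threshold else 0
        edges.append(marked)
    return edges


# A dark region on the left meets a bright region on the right.
sample = [
    [10, 10, 10, 200, 200, 200],
    [10, 10, 10, 200, 200, 200],
]
for row in horizontal_edges(sample, threshold=100):
    print(row)
```

Because no pixel's result depends on any other pixel's result, the same calculation could in principle be replicated many times across a chip.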
Where does the speed gain come from? An FPGA can be customized to work on a particular algorithm.
'An FPGA is a chip whose hardware can be changed by downloading software into it,' according to Amar Shan, product manager for the Cray XD1. Unlike standard microchips, whose functions are fixed at the factory, the transistors within an FPGA can be reprogrammed. Each chip has a layer of memory that can hold programs to organize transistors for specific applications.
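One way to picture that layer of memory: FPGA logic is commonly built from small lookup tables whose stored bits decide what each block of logic computes. The article does not go into this detail, so treat the sketch below as a simplified model with hypothetical names. Real devices use larger tables plus programmable wiring between them.

```python
from typing import Sequence


class LookupTable:
    """Software model of a two-input FPGA lookup table (LUT).

    The four configuration bits play the role of the chip's programmable
    memory: loading new bits changes the logic function without changing
    the 'hardware' (this class).
    """

    def __init__(self) -> None:
        self.config = [0, 0, 0, 0]  # one stored output bit per input pair

    def program(self, truth_table: Sequence[int]) -> None:
        """Load outputs for inputs (0,0), (0,1), (1,0), (1,1), in that order."""
        if len(truth_table) != 4 or any(bit not in (0, 1) for bit in truth_table):
            raise ValueError("truth_table must hold exactly four bits")
        self.config = list(truth_table)

    def evaluate(self, a: int, b: int) -> int:
        """Look up the stored output for this pair of input bits."""
        if a not in (0, 1) or b not in (0, 1):
            raise ValueError("inputs must be 0 or 1")
        return self.config[(a << 1) | b]


lut = LookupTable()

lut.program([0, 0, 0, 1])   # configure the block as an AND gate
print("AND:", [lut.evaluate(a, b) for a in (0, 1) for b in (0, 1)])

lut.program([0, 1, 1, 0])   # reprogram the same block as an XOR gate
print("XOR:", [lut.evaluate(a, b) for a in (0, 1) for b in (0, 1)])
```

The same object behaves as two different gates depending only on the bits loaded into it, which is the sense in which an FPGA's function is not fixed at the factory.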
'It gives you the performance you'd expect from hardware rather than software,' Shan said. Manufacturers often customize chips, called application-specific integrated circuits (ASICs), for specific hardware. But ASICs are more expensive than FPGAs, even though they might be slightly faster. ASICs are also less flexible. An FPGA can be reprogrammed over and over to handle new and emerging applications.

Paying the cost
FPGAs do have a downside, though. Applications that use FPGAs require more than a quick recompile. The programmer must carve out which parts of the chip will be used for which tasks. This chore can't be accomplished with standard programming tools. Tweaking the chip itself requires skills in hardware circuit design, which are rarer than programming chops.
This is the problem that Duncan Buell ran into over a decade ago. Now the interim dean of the University of South Carolina's College of Engineering and Information Technology, Buell worked for the Institute for Defense Analyses in the early '90s. The Defense Department commissioned the IDA to look into the use of FPGAs. Buell found that FPGAs offered many advantages but were rarely used because few shops had the skills to program them.
Even today, the problem remains, Buell said. Programmers are more abundant than electrical engineers, and hardware description languages such as Verilog and VHDL are not intuitive to programmers.
'VHDL tools and chip tools are for designing circuits, not for compiling code,' Buell said. He suspected that few managers today would invest in the training needed to get their staff up to speed in such interfaces.
Of the computer vendors now offering FPGA systems, SRC offers the best interface for programmers, Buell said. Its tools come closest to a typical environment for the C programming language. Other efforts are under way to simplify the process as well. An industry working group, called OpenFPGA, is developing open interfaces that can be used to simplify design.
SGI admits that hardware programming chops are needed for its new product. But the company is working to minimize the requirement as much as possible, Renwick said. SGI bundles the RASC unit with a GNU debugger, a set of application programming interfaces and a core services library. It also offers demonstration versions of FPGA developer software from Celoxica Ltd. of Oxfordshire, England, and Mitrionics Inc. of Culver City, Calif.
'Our ultimate dream is to [require] no experience in electronic design automation. Working with folks like Mitrionics eliminates a lot of that, but today you will need programmers with FPGA experience,' Renwick said.
Nonetheless, the company hopes that agencies will see the investment as worthwhile. With the performance gains that FPGA computing can offer, it may be right.