System X designers beat the odds
- By Joab Jackson
- Jul 14, 2004
Srinidhi Varadarajan: G5 supercomputer architect
Srinidhi Varadarajan, an assistant professor of computer science at Virginia Polytechnic Institute and State University, helped make supercomputing history last year.
The $10 million Mac OS X supercomputer he designed, nicknamed System X, outperformed systems costing hundreds of millions more.
His secret ingredient: 1,100 off-the-shelf Apple Power Mac dual-processor G5s. The design raised eyebrows, because Apple Computer Inc. had never considered itself a maker of supercomputing components.
System X is currently ranked by Top500.org as the third most powerful computer in the world. Virginia Tech uses it for systems modeling, computational chemistry, biochemistry, nanoscale electronics and other research problems. The G5s meanwhile have been replaced by Apple Xserve units, which require less cooling and electricity.
Varadarajan received a bachelor's degree in electronics and communications engineering from the Regional Engineering College of Warangal, India, and a Ph.D. in computer science from the State University of New York at Stony Brook. The National Science Foundation has given him a Faculty Early Career Development award.
GCN associate editor Joab Jackson interviewed Varadarajan by phone at his Blacksburg, Va., office.
GCN: You often speak at conferences about the increasing difficulty of testing new supercomputing technologies. Why is it harder when there are so many more supercomputers today?
VARADARAJAN: The large supercomputing facilities at the Energy Department, NASA and National Science Foundation centers must be perfectly stable as a production resource.
If you want to improve performance or try new programming models or memory management techniques, the machine will necessarily become unstable. If it crashes, you are in trouble. And you need to test on very-large-scale systems.
Simulation alone is not sufficient.
In the late 1980s and early 1990s, we had access to machines where we could do testing, but then they disappeared.
The scalability problems you see with systems of 2,000 processors are considerably different from those you see with 32 processors.
When you are writing an algorithm today, you have to rely more on intuition with no experimental evidence to back it up. You don't know if the implementation is really scalable or not. So System X is intended to operate with both kinds of code.
GCN: When you were planning System X, how did you arrive at using Apple Power Mac G5s? Did you look at other 64-bit chips, such as the Intel Itanium 2 or Advanced Micro Devices' Opteron?
VARADARAJAN: We first looked at a Dell Inc. Itanium 2 solution. We got the architecture nailed down, but the machine would have cost about $12 million. We didn't have $12 million.
Dell pulled out at the last minute, less than a day before the memorandum of understanding ended. We tried to work with Hewlett-Packard Co. on another Itanium solution, but the same thing happened.
So the next thing we did was work with IBM Corp. to get a PowerPC 970 solution. The PowerPC 970 processor was considerably better for our applications, but the clock rates it could support were fairly low.
IBM was building the chassis for data centers, where you can put two of them back to back in a rack. The problem is, the amount of heat generated that way is phenomenal.
For us, space was not much of an issue, but heat definitely was. You could cool one rack by putting an air conditioner right in front of it and blasting air through. But if you have 50 racks, you cannot buy and operate 50 air conditioners; it is cost-prohibitive to do that.
So we worked with IBM trying to get an Opteron solution. The problem with the Opteron was its floating-point unit. Our applications are largely scientific, and they do a lot of multiplication and addition. The PowerPC 970 and the Itanium 2 each can do both a multiplication and an addition within one cycle of throughput. The Opteron can't do that.
So, essentially, the Opteron design became very large. One dual-Opteron machine is cheaper than a G5 platform. Two of them are not.
That is what caused that particular solution's cost to go beyond $10 million.
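The sizing argument above can be sketched in Python as a rough illustration. The 2.0 GHz clock rate and the per-cycle throughput figures below are assumptions chosen for the example, not numbers from the interview, and the function names are hypothetical.

```python
def peak_flops(nodes: int, cpus_per_node: int, clock_hz: int, flops_per_cycle: int) -> int:
    """Theoretical peak floating-point operations per second for a cluster."""
    return nodes * cpus_per_node * clock_hz * flops_per_cycle


def nodes_needed(target_flops: int, cpus_per_node: int, clock_hz: int, flops_per_cycle: int) -> int:
    """Smallest node count whose theoretical peak reaches target_flops."""
    per_node = cpus_per_node * clock_hz * flops_per_cycle
    return (target_flops + per_node - 1) // per_node  # integer ceiling division


CLOCK_HZ = 2_000_000_000  # assumption: 2.0 GHz parts
WITH_MULTIPLY_ADD = 4     # assumption: two FP units, each a multiply plus an add per cycle
WITHOUT_MULTIPLY_ADD = 2  # assumption: a chip with half that per-cycle throughput

# Peak of 1,100 dual-CPU nodes under the assumptions above.
target = peak_flops(1100, 2, CLOCK_HZ, WITH_MULTIPLY_ADD)

print(nodes_needed(target, 2, CLOCK_HZ, WITH_MULTIPLY_ADD))     # 1100
print(nodes_needed(target, 2, CLOCK_HZ, WITHOUT_MULTIPLY_ADD))  # 2200
```

Under these assumptions, halving the per-cycle throughput doubles the number of nodes needed for the same peak, which is how a cheaper node can still yield a more expensive cluster.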
Right about then, Apple announced a PowerPC 970 platform. We went to see what these nodes looked like. The motherboard was very well-designed. Everything matched. The memory management was excellent. There were two CPUs, the fastest 970 processors. The price point was perfect. It was a completely Unix-based system. Right then, we knew we could do this.
So we called up the local Apple representative to see if he could get an order placed.
GCN: What did the sales rep say when you said you wanted to order 1,100 of the G5s?
VARADARAJAN: His reaction was, 'We need to talk to headquarters.'
Cupertino said, 'Look, you need to come here and talk to us.' So we took a flight on Sunday night, met all day Monday and flew back Monday night. We spoke to senior management to convince them that it would work. They asked us for 24 hours to make up their minds.
GCN: What was their concern?
VARADARAJAN: The G5 was Apple's leading product. It had never been used in a project of this size, and a fairly high-profile one, so it could have led to marketing nightmares.
Everyone was telling us that such a system wouldn't work. There were too many risks.
We were trying out a communications fabric that had never been tried before at this size. We had a brand-new cooling system design. We had no real history of doing supercomputing before. Pretty much everything stacked up against success.
The plan made perfect sense to us, but it was very hard to convince others that this was doable.
GCN: So obviously Apple did decide to provide the computers for you.
VARADARAJAN: We got the first machines coming off the assembly line, and that was our other big risk factor. The first application we ran was a benchmark, which is very taxing. It uses all processors at full capacity for hours at a stretch.
The first benchmark result was a little disappointing. We optimized the system over a five-week period. We started picking up pace after Oct. 10 or so.
The first number we reported, on Oct. 1, was only 800 billion floating-point operations per second. By Oct. 11, the machine was at 7 trillion FLOPS, and every day it was producing another 10 percent more.
We had to do some fairly hectic work getting the optimizations correct: work inside the operating system, numerical libraries and system management groups.
GCN: You've said that one of the unforeseen problems with clusters is the uptime of individual machines.
VARADARAJAN: If you have a node that goes down even once a year, and there are 1,100 nodes, that means 1,100 failures a year; that's three a day. And that is fairly normal for systems of this size.
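The arithmetic behind that estimate can be checked with a short Python sketch. It assumes nodes fail independently and at a uniform rate over the year, which is a simplification; the function name is hypothetical.

```python
def expected_failures_per_day(nodes: int, failures_per_node_per_year: float) -> float:
    """Expected cluster-wide failures per day, assuming independent, uniform failures."""
    return nodes * failures_per_node_per_year / 365.0


# 1,100 nodes, each going down once a year on average.
rate = expected_failures_per_day(1100, 1.0)
print(f"{rate:.2f} failures per day")  # 3.01 failures per day
```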
Systems on a big scale need fault tolerance, which means recovering from failures without ever causing applications to be affected. The [management software we developed] is both operating system-independent and hardware-independent. It recovers transparently from a failure.
Here is how the fault tolerance operates: Say you have 1,100 nodes and 10 hot-spare nodes. When there is a failure, your application migrates to one of the hot spares. You never see the failure.
Later, when one job terminates, it returns a node back to the hot-spare pool. You always have 10 hot spares. When the system reaches a certain threshold of, say, five bad nodes, you call in a systems administrator to fix the bad nodes.
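The hot-spare bookkeeping described above might look roughly like the following Python sketch. This is not the Virginia Tech software: the class and method names are hypothetical, and the sketch covers only the pool accounting, not the transparent migration of a running application.

```python
class HotSparePool:
    """Tracks hot spares and bad nodes (illustrative only)."""

    def __init__(self, spares, target=10, bad_threshold=5):
        self.spares = list(spares)          # node ids currently held in reserve
        self.target = target                # spares to keep on hand (10 in the example)
        self.bad_threshold = bad_threshold  # bad-node count that triggers a repair call
        self.bad = []                       # failed nodes awaiting repair

    def handle_failure(self, failed_node):
        """Record the failed node and return a spare to take over its work."""
        if not self.spares:
            raise RuntimeError("no hot spares available")
        self.bad.append(failed_node)
        return self.spares.pop()

    def job_finished(self, freed_nodes):
        """Top the pool back up from a finished job's nodes; return the rest."""
        freed = list(freed_nodes)
        while freed and len(self.spares) < self.target:
            self.spares.append(freed.pop())
        return freed

    def needs_admin(self):
        """True once enough bad nodes have accumulated to call an administrator."""
        return len(self.bad) >= self.bad_threshold


pool = HotSparePool(spares=range(1100, 1110))
replacement = pool.handle_failure(failed_node=42)  # application moves to a spare
leftover = pool.job_finished([7, 8, 9])            # one freed node refills the pool
print(replacement, len(pool.spares), leftover, pool.needs_admin())
```

The pool is refilled only when a job ends, so the number of spares dips briefly after each failure and returns to the target once nodes are freed.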
We're commercializing that software now. I am the chief technical officer of California Digital Corp., headquartered in Fremont, Calif. We have a software development center in Blacksburg.
GCN: How did you first get interested in IT?
VARADARAJAN: I was in Gujarat, India, at the time, in the eighth grade. I wanted to play video games but my parents wouldn't let me have any. So I learned Basic to program video games. I used a British computer called the BBC Micro [from Acorn Computers Ltd. of Cambridge, England].
To this day, I think the Micro is one of the best machines to teach programming on, because it's simple.
By the tenth grade, I knew what was going on inside the machine: what the processor was, what the memory hierarchy was and everything else associated with it.
But when you look at a modern PC, it is fairly intimidating. It doesn't seem like something you could learn to program in six months.