Daniel Reed | It's software at the core

Interview with Daniel Reed, director of the Renaissance Computing Institute

It seems fitting that Daniel Reed is director of the Renaissance Computing Institute, because he is a bit of a Renaissance man himself, at least when it comes to computing. Reed has proved influential in a remarkable variety of spheres. He advises the president as a member of the President's Council of Advisors on Science and Technology. He leads the Computing Research Association.

He's also overseen the National Center for Supercomputing Applications, and he was principal investigator and chief architect of the National Science Foundation's TeraGrid project.

As Reed reckons it, RCI's goal is to push beyond the traditional boundaries of computer science, tackling tough computational problems not only with computer science but also with the aid of business and the social sciences. The institute was founded in 2004 by Duke University, North Carolina State University, the University of North Carolina at Chapel Hill and the state of North Carolina.

GCN: What does RCI do?

Reed: The highest priority is to look at how computer technology affects broad societal problems. It's really about bringing people together across disciplinary boundaries. I think a lot of problems we deal with in this decade lie at the intersection of multiple disciplines. Our role is to be a catalyst for innovation. And that spans everything from traditional computer science to supporting the humanities or performing arts.

There is advancing computing as [a goal], which is the basic technology research, and then there is application. And it is important to keep them coupled. For any given technology, there is a wide variety of ways it could be advanced. Having a set of driving problems that defines the opportunity helps push technology in certain directions.

One of the big problems we're addressing now is rapid population growth in environmentally sensitive areas. As an example, in North Carolina, we have rapid coastal population growth in areas with fragile ecosystems [that] are susceptible to severe weather. How do we look at predicting the effect of hurricanes and storm surge in those areas? Our goal is to forecast what the impact is likely to be and where it will happen and, in the longer term, use that information to influence zoning and planning.

So to do all this, you need to bring together people who work for different organizations, live in different places and have different areas of expertise: ocean and atmospheric science, computational modeling, state planning for disasters and logistics. We provide the computing capability, the software and the people who will fill the holes among the groups.

GCN: It sounds like you have some insight into how to work across organizational boundaries. How do you unlock experts from their silos?

Reed: You have to recognize the individual agencies' missions. There are problems that you want to solve that don't fall within any particular agency's mission but depend on the agencies working together. So given the realities of their missions, you have to find the piece of the collaborative solution that also advances each agency's mission goals. If you can do that, it's a win for them and, concurrently, you solve a problem that everyone can take partial credit for.

Working with the federal agencies, I've found that in many cases the senior folks are very much aware of these larger problems, and they are often very pleased to see a group that can be neutral ground for them to work together.

GCN: What are some of the fundamental problems that will bedevil computer science in the next few years?

Reed: It's the software. The challenge will be how to deal with the explosive growth of multicore processors. We will see hundred-plus-core chips soon, and those will be embedded in ever-larger systems. Petascale systems will have hundreds of thousands of cores. The software is not keeping pace with that.

The killer issue is how to exploit large-scale parallelism in a productive way.

The truth is, with commodity technology, any country in the world can occupy the top of the Top 500 list [of the world's most powerful supercomputers]. It is just a matter of political will and money. The hardware in some sense is just money, sometimes large amounts of money, but it's just money. The United States has long had pre-eminence in that domain, but most of that technology is a commodity that anyone can buy. Exploiting that efficiently is the hard part. We're in a race for macho-flops; we're not in a race for time-to-solution. And that is an entirely different metric, one that is tied up with building an integrated, well-balanced ecosystem of hardware and software.
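
As an editor's illustration of the gap Reed describes, here is a minimal, hypothetical Python sketch of spreading even an embarrassingly parallel computation across cores; the function names, chunk size and worker count are invented for illustration and are not from the interview.

    # parallel_sum.py -- toy multicore decomposition, editor's illustrative sketch
    from concurrent.futures import ProcessPoolExecutor
    import math

    def partial_work(bounds):
        """Do an expensive calculation over one slice of the index range."""
        start, end = bounds
        return sum(math.sqrt(i) for i in range(start, end))

    def parallel_sum(n, workers=8, chunk=1_000_000):
        # Decompose the range into chunks: too few chunks leaves cores idle,
        # too many drowns the useful work in scheduling overhead.
        slices = [(i, min(i + chunk, n)) for i in range(0, n, chunk)]
        with ProcessPoolExecutor(max_workers=workers) as pool:
            return sum(pool.map(partial_work, slices))

    if __name__ == "__main__":
        print(parallel_sum(10_000_000))

Even in this trivial case, the useful answer depends on how the work is decomposed and recombined, which is the time-to-solution concern Reed distinguishes from raw hardware capability.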

On the government side, those problems take long-term investment. Over the years, we've launched many different projects to try to develop software to deal with parallelism in an automatic way, and we've never sustained these programs. We've killed them after a handful of years. The notion of saying this problem may take a decade to solve and that we'll have no success in the first 18 to 24 months is a hard sell.

The hope is [that] the emergence of multicore [commodity processors] will force those issues into the open, that the commercial market will have to deal with them, and that success will trickle up to high-end computing.

GCN: You have said that software resiliency is another problem.

Reed: If we built biological organisms the way we build computer software, the first time a cell died, the organism would die. In system design, we presume that everything works correctly, by and large. Networking software is one of the few exceptions. But large, complex systems behave in unexpected ways when components go wrong. You want the overall system to continue to operate reliably even when components fail. We don't have good mechanisms to design systems with those characteristics.
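
As a purely illustrative sketch of the property Reed describes, the wrapper below isolates a failing component so that its faults degrade the result instead of killing the whole program; the component, retry policy and fallback value are hypothetical and not drawn from the interview.

    # resilient_call.py -- toy fault-isolation wrapper, editor's illustrative sketch
    import random
    import time

    def flaky_component(x):
        """Stand-in for a component that sometimes fails outright."""
        if random.random() < 0.3:
            raise RuntimeError("component fault")
        return x * 2

    def call_with_resilience(func, arg, retries=3, fallback=None, delay=0.1):
        # Retry a few times with a growing pause, then fall back, so one dead
        # "cell" does not take the whole "organism" down with it.
        for attempt in range(retries):
            try:
                return func(arg)
            except RuntimeError:
                time.sleep(delay * (attempt + 1))
        return fallback

    if __name__ == "__main__":
        print([call_with_resilience(flaky_component, i, fallback=0) for i in range(5)])

Patterns like this exist piecemeal, but, as Reed notes, there are no general organizing principles for designing whole systems that keep operating when their parts fail.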

And, arguably, large software systems are the most complex creations that humans have ever made. They are far more complex than buildings, even very large buildings. A 100 million-line software system is incredibly complex, and we still build those things in a craftsman model. We don't have good organizing principles that will let us design them to meet certain specifications.
