How to increase access to supercomputers
Limited access to high performance computing (HPC) resources is hampering the nation’s ability to maintain its competitiveness, slowing the adoption of artificial intelligence in defense applications and impeding innovation that can benefit society in areas such as health care and environmental research.
That assessment comes from the Center for Data Innovation, which outlined how the U.S. can ensure the supply of HPC resources keeps up with demand in its recent report, “How the United States Can Increase Access to Supercomputing.”
HPC resources are spread across the federal, industry and academic sectors, with the Department of Energy consuming the lion’s share of federal funding for a relatively small number of research projects. The National Science Foundation supports “the long tail” of researchers trying to access a shallower pool of supercomputers. Neither agency has been able to meet its current demand, which is three times greater than current capacity, the report said.
The Center for Data Innovation called on Congress to increase total NSF investment in HPC infrastructure to at least $500 million per year and boost total DOE funding in HPC infrastructure to at least $1.5 billion per year. Additionally, it said, NSF should analyze where current awards are going and prioritize funding for states that conduct high levels of artificial intelligence research but have low levels of HPC availability. Access for groups traditionally under-represented in science and engineering -- women and minorities -- should be encouraged, and creating a trained HPC workforce is essential, according to the report.
Cloud-based access to HPC resources should also be encouraged, the report said.
Cloud platforms, such as Amazon Web Services, Microsoft Azure, and Google Cloud, give users as-needed access to computing power, storage and databases through internet-enabled network interfaces, rather than requiring researchers to work at a specific facility. Additionally, cloud-based HPC resources are elastic. They can easily accommodate workloads needing just a single virtual GPU or those requiring a substantial portion of the entire cloud. The cloud’s ability to scale to match supply and demand avoids bottlenecks caused by too many workloads and wasted resources from too few, the report said.
Yet while cloud resources can help alleviate the shortage of HPC resources, they are not suited for all workloads, the center said. Highly parallelized applications work well in the cloud, but those that require independent nodes to communicate with each other run more efficiently in on-premises systems that have faster internal networks. Additionally, workloads like large simulations that require significant data transfers can rack up costs quickly.
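To make the networking point concrete, here is a rough back-of-the-envelope sketch in Python. It is not taken from the report: the model (compute split evenly across nodes, plus a fixed per-message latency cost) and every number in it are assumptions chosen purely for illustration, and the function names `step_time` and `speedup` are hypothetical.

```python
def step_time(work_seconds, nodes, messages_per_step, latency_seconds):
    """Estimate the wall-clock time of one step of a parallel job.

    Simplified model: compute is split evenly across nodes, and each step
    also pays messages_per_step * latency_seconds for communication.
    """
    compute = work_seconds / nodes
    communication = messages_per_step * latency_seconds
    return compute + communication


def speedup(work_seconds, nodes, messages_per_step, latency_seconds):
    """Speedup relative to one node doing all the work with no communication."""
    return work_seconds / step_time(
        work_seconds, nodes, messages_per_step, latency_seconds
    )


# Hypothetical values, for illustration only (not measurements).
WORK = 10.0            # seconds of compute per step on a single node
NODES = 100
FAST_LATENCY = 2e-6    # assumed latency of an on-premises interconnect, seconds
SLOW_LATENCY = 50e-6   # assumed latency of a general-purpose cloud network, seconds

for label, msgs in [("independent tasks", 0), ("tightly coupled", 10_000)]:
    fast = speedup(WORK, NODES, msgs, FAST_LATENCY)
    slow = speedup(WORK, NODES, msgs, SLOW_LATENCY)
    print(f"{label}: fast network {fast:.1f}x, slow network {slow:.1f}x")
```

Under these assumed numbers, the job made of independent tasks scales the same on either network, while the tightly coupled job loses most of its speedup on the slower one. Real systems are more complicated (bandwidth, topology and overlap of compute with communication all matter), so treat this only as an intuition aid.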
Because maximizing returns on HPC investments will require careful resource management, DOE and NSF should require funded institutions to adopt HPC auditing tools, the report said. NSF, meanwhile, should publish roadmaps of its priorities and ensure its strategic decisions on resource allocations reflect user requirements.
Connect with the GCN staff on Twitter @GCNtech.