Supercomputing without the supercomputer
- By Chris Steel
- Sep 08, 2015
President Barack Obama this summer signed an executive order creating a National Strategic Computing Initiative with the goal of maximizing high-performance computing for economic and scientific research benefits. The order appears to be a response to China’s lead in the TOP500’s supercomputer performance ranking with its 33.86 petaflop Tianhe-2. The first objective of the order is to accelerate delivery of an exascale supercomputer -- a machine roughly 30 times faster than Tianhe-2 and 1,000 times faster than a petaflop system.
As all agencies move toward real-time, data-driven execution, the need for high-performance computing will continue to grow. Unfortunately, the cost of HPC is not coming down in line with the need. So what can agencies do today to accelerate benefits from HPC while continuing to operate in a cost-cutting economy?
The answer is to find innovative ways to achieve HPC without the cost of supercomputers -- in effect, achieve supercomputing without the supercomputer.
Fortunately, transformative technologies exist to support the emerging class of power users who seek to perform massive data analytics with high data availability and extremely low latency -- at any scale. Some alternatives even allow agencies to use their existing assets to begin to achieve HPC-like capabilities without the need for investing in or time-sharing on a supercomputer.
The scale-up or scale-out conundrum
There are two ways to increase computing performance: vertically, by building a bigger supercomputer, or horizontally, by running many smaller systems in parallel as a computing cluster. The latter is the approach Google takes for its search engine, because it lets the company scale out further and faster than any monolithic supercomputer could.
The benefit of scaling horizontally for agencies is that it allows them to use existing computing resources, thereby significantly reducing costs.
Today, many agencies are reaping the benefits of big data by taking advantage of horizontally scaling technologies such as Hadoop. Hadoop has a distributed file system and allows batch execution of jobs in a highly parallelized fashion. It is great for tasks that don’t need to be done in real time and can be easily broken into independent pieces of work that can run in parallel.
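The pattern Hadoop exploits -- split the input into independent chunks, process each in parallel, then combine the partial results -- can be sketched in a few lines. This is a minimal illustration using Python’s standard multiprocessing pool, not Hadoop itself; the word-count job and chunk data are hypothetical.

```python
# Illustrative sketch of an embarrassingly parallel batch job:
# independent chunks are mapped in parallel, then reduced.
from multiprocessing import Pool

def count_words(chunk):
    """Map step: count words in one independent chunk of text."""
    counts = {}
    for word in chunk.split():
        counts[word] = counts.get(word, 0) + 1
    return counts

def merge(partials):
    """Reduce step: combine the per-chunk counts into one result."""
    total = {}
    for part in partials:
        for word, n in part.items():
            total[word] = total.get(word, 0) + n
    return total

if __name__ == "__main__":
    chunks = ["to be or not to be", "that is the question"]
    with Pool(2) as pool:               # each chunk runs independently
        partials = pool.map(count_words, chunks)
    print(merge(partials))
```

Because no chunk needs another chunk’s output, adding machines (or processes) speeds the job up almost linearly -- which is exactly why this class of work suits Hadoop.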
However, many types of computation do not lend themselves to this approach: their pieces depend on one another’s results, or they are inherently serial. Those applications demand a different type of horizontal scalability.
In-memory computing: The horizontal scaling alternative
The faster alternative to storage-based scaling solutions lies with in-memory computing technology. Processors can access data from RAM tens of thousands of times faster than from a local disk drive or a distributed file system. However, datasets used in HPC applications are often too large to fit in local RAM alone. They require a distributed memory cache, which is exactly what in-memory computing relies on.
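The idea behind a distributed memory cache can be sketched as follows: each key is hashed to a node, so the combined RAM of the cluster behaves as one large cache. This is a toy model under simplifying assumptions (nodes are just dictionaries; real data grids add replication, rebalancing and network transport), and all names are illustrative.

```python
# Toy sketch of a distributed memory cache: keys are hashed
# to one of several "nodes" so a dataset too large for one
# machine's RAM spreads across the cluster's combined memory.
import hashlib

class DistributedCache:
    def __init__(self, num_nodes=3):
        # Each dict stands in for one node's local RAM.
        self.nodes = [{} for _ in range(num_nodes)]

    def _node_for(self, key):
        # Hash the key so reads and writes route to the same node.
        h = int(hashlib.md5(key.encode()).hexdigest(), 16)
        return self.nodes[h % len(self.nodes)]

    def put(self, key, value):
        self._node_for(key)[key] = value

    def get(self, key):
        return self._node_for(key).get(key)

cache = DistributedCache()
cache.put("sensor:42", [1.0, 2.5, 3.7])
print(cache.get("sensor:42"))
```

The design choice to hash keys to nodes is what lets capacity grow horizontally: adding a node adds RAM to the pool rather than requiring a bigger single machine.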
By design, in-memory computing moves datasets to application memory, thereby reducing database transactions and delivering a much more responsive experience to the end user. The most compelling benefit is that in-memory computing allows users to achieve massive scale while relying on their existing computing architectures -- essentially turbocharging enterprise applications for increasingly demanding data-analysis tasks.
Agencies that are looking to HPC to address their real-time analytics needs might find that in-memory computing can fulfill many or most of those needs without an investment in supercomputing resources.
In-memory computing can also speed up agencies’ existing Hadoop analyses. Rather than relying on the Hadoop Distributed File System to push data out to the nodes and collect results for every job, in-memory technologies keep the working data in RAM across the cluster and serve it to MapReduce jobs directly. That approach can speed up existing MapReduce jobs by a factor of 10 or even 100.
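The source of that speedup can be sketched simply: load the data into memory once, then run multiple MapReduce-style analyses over it without returning to the file system between jobs. The dataset and the two jobs below are illustrative, and the single-process example only stands in for what a cluster-wide memory layer does.

```python
# Sketch of reusing an in-memory dataset across jobs: in a purely
# file-based workflow, each job would re-read its input from HDFS;
# here the load happens once and every job works from RAM.

records = None  # loaded once, shared by every subsequent job

def load_once():
    global records
    if records is None:
        # Stand-in for a one-time read from the file system.
        records = [("alice", 3), ("bob", 5), ("alice", 2)]
    return records

def total_by_user(data):
    """Job 1: sum the values per user."""
    out = {}
    for user, n in data:
        out[user] = out.get(user, 0) + n
    return out

def max_by_user(data):
    """Job 2: take the maximum value per user."""
    out = {}
    for user, n in data:
        out[user] = max(out.get(user, 0), n)
    return out

data = load_once()
print(total_by_user(data))  # both jobs reuse the same in-memory data
print(max_by_user(data))
```

Iterative and multi-pass analytics benefit most, since the cost of re-reading input from disk is paid once instead of once per job.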
By combining that enhanced level of performance with Hadoop’s ability to scale to data volumes beyond the reach of today’s supercomputers, many agencies can achieve the computing power they need without a supercomputer.
Although federal agencies must balance their HPC mission requirements with available funding and personnel, one practical system modernization option is to maximize the speed, value and scale of existing computing resources. For the highest performance in an existing agency computing architecture, consider in-memory computing -- a viable alternative to expensive supercomputers and a natural complement to current enterprise investments.
Chris Steel is chief solutions architect at Software AG Government Solutions.