Big data just got a little smaller
- By John Breeden II
- Jul 19, 2012
Although it hasn’t gotten the kind of press it probably deserves, a new invention could change the way we all work with computers. Researchers at Carnegie Mellon University have apparently come up with a way of using hard drives that could turn even average systems into near-supercomputers for calculations involving big sets of data.
I know it sounds a bit like cold fusion, and even I had a hard time believing it, but all the evidence seems to be on the level. And if so, the technology could probably help government agencies that wrestle with the problems associated with big data.
According to Carlos Guestrin, who runs the lab there, one of his students, Aapo Kyrola, found a way to make hard drives more efficient. That may not seem like much, but the results speak for themselves. With Kyrola’s program, called GraphChi, the traditional bottlenecks associated with hard drives -- physical spin speeds, network bus limitations and transfer times -- have been overcome.
To test GraphChi, the lab decided to analyze Twitter’s entire social graph, which contained 1.2 billion connections among 40 million users. The last time this was done, a cluster of 1,000 computers needed 400 minutes to process the data. When the same calculation was run on a modest Mac Mini running the software, it took just 59 minutes.
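To put those figures side by side, here is a quick back-of-the-envelope calculation in Python. It uses only the numbers reported above; treating each of the 1,000 clustered computers as busy for the full 400 minutes is a simplifying assumption for illustration, not something the researchers stated.

```python
# Figures as reported in this article.
CLUSTER_MACHINES = 1000
CLUSTER_MINUTES = 400
SINGLE_MACHINE_MINUTES = 59


def wall_clock_speedup():
    """How many times faster the single machine finished, by the clock."""
    return CLUSTER_MINUTES / SINGLE_MACHINE_MINUTES


def machine_minutes_ratio():
    """Total machine-minutes on the cluster versus on the single machine.

    Assumes (for illustration only) that every cluster machine was in use
    for the whole run.
    """
    cluster_total = CLUSTER_MACHINES * CLUSTER_MINUTES
    return cluster_total / SINGLE_MACHINE_MINUTES


print(f"Wall-clock speedup: {wall_clock_speedup():.1f}x")           # 6.8x
print(f"Machine-minutes ratio: {machine_minutes_ratio():,.0f} to 1")  # 6,780 to 1
```

By the clock, the single machine finished in a bit less than one-seventh of the cluster's time. Counted in machine-minutes, the gap is far larger, which is the more striking part of the result.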
It sounds crazy that a single computer can do the work of a 1,000-node cluster in about one-seventh of the time. But where most computers have to load data from their hard drives into RAM, running into every bottleneck associated with a hard drive along the way, GraphChi does its work on the hard drive itself. In that model, the bigger the hard drive, the more powerful the computer, to a point anyway.
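To give a feel for what working from the hard drive can look like, here is a minimal Python sketch that counts each user's outgoing connections by reading an edge file in small sequential chunks, so the whole graph never sits in RAM at once. This is not GraphChi's code or its actual algorithm; the file layout, the function names `write_edges` and `out_degrees`, and the tiny sample graph are all made up for illustration.

```python
import os
import struct
import tempfile
from collections import Counter

# Hypothetical on-disk format: each edge is two unsigned 32-bit user ids.
EDGE = struct.Struct("<II")


def write_edges(path, edges):
    """Write (source, destination) pairs to a flat binary file."""
    with open(path, "wb") as f:
        for src, dst in edges:
            f.write(EDGE.pack(src, dst))


def out_degrees(path, edges_per_chunk=2):
    """Count outgoing connections per user, one small chunk at a time.

    Only the current chunk and the running counts are held in memory;
    the edges themselves stay on disk.
    """
    counts = Counter()
    chunk_bytes = EDGE.size * edges_per_chunk
    with open(path, "rb") as f:
        while True:
            chunk = f.read(chunk_bytes)
            if not chunk:
                break
            for src, _dst in EDGE.iter_unpack(chunk):
                counts[src] += 1
    return counts


def demo():
    """Build a five-edge sample graph on disk and count connections."""
    edges = [(0, 1), (0, 2), (1, 2), (2, 0), (2, 1)]
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "edges.bin")
        write_edges(path, edges)
        return dict(out_degrees(path))


print(demo())
```

The chunk size here is deliberately tiny so the loop runs several times on the sample data. A real system would read much larger blocks, since hard drives are at their best when read sequentially in big pieces.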
While this is an amazing development, keep in mind that this program was written for a specific function. But if this technology could be adapted for general use, it would really change the world. No longer would average users have to upgrade their systems every couple of years. And what of older PCs? They could probably get a huge performance boost from GraphChi-like technology. And to get all this with a software upgrade is simply amazing. Kudos to the team at Carnegie Mellon. Our soon-to-be much faster PCs thank you.
John Breeden II directs the GCN Lab.