Big data just got a little smaller

Although it hasn’t gotten the kind of press it probably deserves, a new invention could change the way we all work with and use computers. Researchers at Carnegie Mellon University have apparently come up with a way of using hard drives that could let even average systems perform like near-supercomputers on calculations involving big sets of data.

I know it sounds a bit like cold fusion, and even I had a hard time believing it, but all the evidence seems to be on the level. And if so, the technology could probably help government agencies that wrestle with the problems associated with big data.

According to Carlos Guestrin, who runs the lab there, one of his students, Aapo Kyrola, found a way to make hard drives more efficient. That may not seem like much, but the results speak for themselves. With the new program, called GraphChi, the traditional bottlenecks associated with hard drives -- physical spin speeds, network bus limitations and transfer times -- have reportedly been overcome.

To test GraphChi, the lab analyzed Twitter’s entire social graph, which contained 1.2 billion connections among 40 million users. The last time this was done, a cluster of 1,000 computers took 400 minutes to complete the job. When the same calculation was run on a modest Mac Mini running GraphChi, it took just 59 minutes.
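To put those figures in perspective, here is a quick back-of-the-envelope calculation in Python. It uses only the numbers quoted above; the function name and the "machine-minutes" measure are my own illustrative choices, not anything published by the lab.

```python
def compare_runs(cluster_nodes, cluster_minutes, single_minutes):
    """Compare a cluster run against a single-machine run.

    Returns (wall-clock speedup, cluster machine-minutes,
    ratio of cluster machine-minutes to single-machine minutes).
    """
    # How many times faster the single machine finished, by the clock.
    wall_clock_speedup = cluster_minutes / single_minutes
    # Total compute time consumed by the cluster (nodes x minutes).
    cluster_machine_minutes = cluster_nodes * cluster_minutes
    # How much more machine time the cluster spent than the single box.
    machine_minute_ratio = cluster_machine_minutes / single_minutes
    return wall_clock_speedup, cluster_machine_minutes, machine_minute_ratio


# Figures from the Twitter graph test described above.
speedup, cluster_total, ratio = compare_runs(1000, 400, 59)

print(f"Wall-clock speedup: {speedup:.1f}x")
print(f"Cluster machine-minutes: {cluster_total:,}")
print(f"Machine-minute ratio: {ratio:,.0f}x")
```

By the clock, the single machine finished roughly 6.8 times faster. Counting total machine time, the cluster spent 400,000 machine-minutes against the Mac Mini's 59.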

It sounds crazy that a single computer can do the work of a 1,000-node cluster in about one-seventh of the time. But where most computers have to load data from their hard drives into RAM, thereby going through every bottleneck associated with a hard drive, with GraphChi, it’s all done on the hard drive itself. In that model, the bigger the hard drive, the more powerful the computer, to a point anyway.
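The general idea of working through data on disk, rather than loading it all into RAM first, can be sketched in a few lines of Python. To be clear, this is not GraphChi's algorithm, which this article does not describe in detail. It is a minimal illustration, assuming a hypothetical plain-text edge list with one "source destination" pair per line, that counts each user's connections while reading the file sequentially.

```python
import os
import tempfile
from collections import Counter


def count_degrees_streaming(path):
    """Count connections per vertex by streaming an edge list from disk.

    Only one line of the file is held in memory at a time; the edge
    list itself is never loaded in full. The per-vertex counts do
    live in memory.
    """
    degrees = Counter()
    with open(path, "r", encoding="utf-8") as edge_file:
        for line in edge_file:  # file objects read lazily, line by line
            line = line.strip()
            if not line:
                continue
            src, dst = line.split()
            degrees[src] += 1
            degrees[dst] += 1
    return degrees


def demo():
    # Hypothetical toy graph, written to a temporary file for the demo.
    edges = [("alice", "bob"), ("bob", "carol"),
             ("alice", "carol"), ("dave", "alice")]
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "edges.txt")
        with open(path, "w", encoding="utf-8") as f:
            for src, dst in edges:
                f.write(f"{src} {dst}\n")
        return count_degrees_streaming(path)


result = demo()
print(dict(result))
```

Reading the file front to back in one pass is the kind of access pattern hard drives handle best, which is presumably part of why a disk-based approach can be competitive at all.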

While this is an amazing development, keep in mind that this program was written for a specific function. But if this technology could be adapted for general use, it would really change the world. No longer would average users have to upgrade their systems every couple of years. And what of older PCs? They could probably get a huge performance boost when using GraphChi-like technology. And to get all this with a software upgrade is simply amazing. Kudos to the team at Carnegie Mellon. Our soon-to-be much faster PCs thank you.

About the Author

John Breeden II is a freelance technology writer for GCN.

