Second-guessing Irene: Could 120-petabyte array make better predictions?

Predicting the outcome of severe storms is one of the hardest jobs in forecasting, relying on complex models with many variables. The difficulty was evident with Hurricane Irene, which turned out to be less severe than forecasters predicted.

Perhaps the weak point isn’t computing power or human oversight. Perhaps it’s simply storage.

The amount of data storage needed to run the sorts of simulations that help predict weather patterns is, quite simply, enormous. It is orders of magnitude larger than anything a desktop application – or even basic storage for an entire network – currently requires.

IBM researchers have announced that they are now able to link more drives together than ever before – 200,000, to be precise – into one giant continuous pool of storage. Individually, these drives are ordinary serial-attached SCSI (SAS) drives, but when IBM puts them together, they yield 120 petabytes of storage.

If you haven’t been working with long-term backup storage, you might not be familiar with that prefix. Well, you know what a gigabyte is — your desktop’s hard drive and even your key drives are likely quantified in this unit. A petabyte is more than 1 million gigabytes (1,048,576, to be exact – binary, remember?). That’s right – there are as many gigabytes in a petabyte as there are kilobytes in a gigabyte.
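The binary arithmetic above can be sketched in a few lines of Python (purely an illustration of the units, nothing tied to IBM's system):

```python
# Binary (power-of-two) storage units: each step up multiplies by 1,024.
KILOBYTE = 1024            # 2**10 bytes
GIGABYTE = 1024 ** 3       # 2**30 bytes
PETABYTE = 1024 ** 5       # 2**50 bytes

# Gigabytes per petabyte -- the same ratio as kilobytes per gigabyte.
print(PETABYTE // GIGABYTE)          # 1,048,576 gigabytes in a petabyte
print(GIGABYTE // KILOBYTE)          # 1,048,576 kilobytes in a gigabyte

# The IBM array: 120 petabytes, expressed in gigabytes.
print(120 * PETABYTE // GIGABYTE)    # 125,829,120 gigabytes
```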

For some real-world perspective, imagine you are creating the most fantastically impressive slideshow presentation. Dozens of slides, graphics, a little animation, the works – you know, the kind of file that your network administrator yells at you about because it’s taking up too much storage space and no, you cannot e-mail it to anyone. Well, if that monster of a file weighs in at, say, 100 megabytes, 120 petabytes could hold more than a billion of them.

IBM’s research lab in Almaden, Calif., is building the system for an unidentified client that is planning to do supercomputer simulations of real-world events, according to Technology Review.

And that kind of capacity could also be applied to generating more accurate weather predictions.

There are a lot of concerns about combining that many disks, not the least of which is failure rate. With even the best equipment, each individual drive has only a small chance of failing on a given day – but multiply that across 200,000 drives, and at least one failure somewhere in the array becomes a near-daily certainty.
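A back-of-the-envelope sketch in Python makes the point, assuming a purely hypothetical 1 percent annualized failure rate per drive (IBM hasn't published real figures):

```python
# Hypothetical: each drive has a 1 percent chance of failing in a year,
# i.e. roughly a 0.01 / 365 chance of failing on any given day.
drives = 200_000
daily_failure_prob = 0.01 / 365   # per-drive, assumed for illustration

# Probability that at least one of the 200,000 drives fails today:
p_none = (1 - daily_failure_prob) ** drives
p_at_least_one = 1 - p_none
print(f"{p_at_least_one:.3f}")    # ≈ 0.996 -- a near-certainty, every single day
```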

In such a demanding environment, basic redundant array of independent disks (RAID) architecture simply wouldn’t cut it. Even the most robust RAID level in common use, RAID 6, has a fault tolerance of two disks – if a third drive failed before the others were replaced and rebuilt, you’d be out of luck.
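The idea behind parity-based RAID can be sketched with a toy single-parity (RAID 5-style) example in Python – RAID 6 adds a second, independently computed parity block so it can survive two simultaneous failures, but the reconstruction principle is the same:

```python
from functools import reduce

def xor_parity(blocks):
    """Compute a parity block as the byte-wise XOR of all data blocks."""
    return bytes(reduce(lambda a, b: a ^ b, col) for col in zip(*blocks))

# Three data "drives" plus one parity "drive" (toy example, not IBM's scheme).
data = [b"AAAA", b"BBBB", b"CCCC"]
parity = xor_parity(data)

# Lose one drive; rebuild it by XOR-ing the survivors with the parity block.
lost = data.pop(1)
recovered = xor_parity(data + [parity])
print(recovered == lost)   # True: the missing block is reconstructed
```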

So IBM had to come up with its own brand of redundant disk array. The researchers aren’t sharing details, of course, but it involves plenty of spare disks already mounted in the system and software that speeds up the rebuilding process when it detects additional drives failing. IBM claims the storage system could run constantly for a million years without losing data or sacrificing performance. It’s nice to know the machine can outlive our children’s children’s children without further maintenance.

Now, this is all well and good, but the issue people had with Hurricane Irene was not the accuracy of the predictions but the accuracy of what they were told to expect. Storm warnings are always a combination of two things. Weather services would rather overstate a storm’s strength than underplay it, because erring in the other direction could cost lives. And the media, which needs to drive viewership, overstates the danger at times.

So even if this new technology helps predict more accurately the danger a storm presents, what the public hears will likely always sound worse.

About the Author

Greg Crowe is a former GCN staff writer who covered mobile technology.

Reader Comments

Thu, Sep 1, 2011 earth

Funny, nuclear power stations aren’t supposed to melt down more than once every million years. And yet they average one a decade. Maybe IBM needs to talk to Google. What’s their storage capacity?

Honey, where did you say you saved my recipe file?

Thu, Sep 1, 2011 earth

Dang. And I can’t even get my boss to install a water transfer, evaporative cooling system in the server room. Maybe it was the picture I sent along with the request to demonstrate the maturity of the technology. (www.hottubs.com).

Seriously, could they use high temperature solder on those hard drives. Run the whole array at about 150 Centigrade and you could recover some of the waste heat. Otherwise you might get small tornados forming over the heat exhaust. Of course you could study that model as well.

Thu, Sep 1, 2011 andy USA

Mr. Crowe, what exactly is your point? The linkage between weather prediction and an IBM proprietary solution you present is weak at best. Is anybody actually considering this IBM R&D activity as a potential solution to what may or may not actually be a problem? You cite no sources claiming that this solution would be at all useful for meteorological analysis. If this is just an IBM fluff piece, fine. They achieved something interesting which certainly merits some exposure. Tying it to a recent natural disaster without a shred of supporting evidence, however, is the highest form of junk journalism.

Thu, Sep 1, 2011

What SuperComputer did IBM use to arrive at 0% failure rate in a million years?

Thu, Sep 1, 2011 Homer

How many gallons are in a bunch?
