Amazon cloud crash keeps Energy site offline
Collaboration site restored after being dark nearly two days
- By Kevin McCaney
- Apr 22, 2011
This story has been updated from its original version to include the fact that the site has been restored.
A cutting-edge Energy Department collaboration website hosted by Amazon’s cloud services was unavailable for nearly two days after losing service from an Amazon data center, offering one example of what can go wrong with cloud computing.
The outage, which hit Amazon’s Elastic Compute Cloud (EC2) data center in northern Virginia early in the morning of April 21, continued well into the next day for some of the sites affected, although others had come back online.
Energy’s OpenEI.org site, a Semantic Web site that invites public participation in clean energy research, remained offline into the evening of April 22. It was restored late that night. While it remained unavailable, a note posted over the home page said: “Unfortunately, OpenEI’s data center, Amazon EC2, is temporarily down. We are working aggressively to restore service as soon as possible, please check back soon.”
OpenEI apparently was the only government site that went down. Recovery.gov, which tracks stimulus spending, is hosted at EC2’s Virginia center, but was not affected because it was set up to move to another location in the event of an outage, NextGov reported.
The Treasury Department’s home page and several other public-facing sites also are hosted by EC2, but stayed in operation.
A number of popular Web 2.0 sites, including Reddit, Quora, Foursquare and HootSuite, were also hit by the outage, which started at 4:41 a.m. Eastern time Thursday, April 21. By Friday morning, Quora, Foursquare and HootSuite were operating again, although Reddit was still in a limited, read-only mode in which users could not log in.
Updates on Amazon Web Services’ status dashboard attributed the outage to re-mirroring among Elastic Block Store volumes that ate up capacity and made it difficult to create new volumes. Shortly before noon Eastern time Friday, Amazon said lost services continued to be restored, although full recovery would take time.
“We continue to see progress in recovering volumes, and have heard many additional customers confirm that they're recovering,” according to the status update. “Our current estimate is that the majority of volumes will be recovered over the next 5 to 6 hours. As we mentioned in our last post, a smaller number of volumes will require a more time consuming process to recover, and we anticipate that those will take longer to recover.”
Energy launched OpenEI — for Open Energy Information — in December 2009 on an Open Linked Data platform to give the public access to department data and allow contributions. It takes the Semantic Web approach of using Uniform Resource Identifiers to enable machine-to-machine searches and improve collaboration.
The site, aimed at developing clean energy resources, pulls data from the National Renewable Energy Laboratory and other national laboratories.
Kevin McCaney is a former editor of Defense Systems and GCN.