What’s behind most data center outages?

The rising cost of unplanned data center outages will further constrain already tight government budgets, increasing the need for carefully considered risk mitigation strategies and disaster recovery plans.

According to a recent study by Ponemon Institute, the average cost of a data center outage rose to $740,357 in 2015 -- an increase of 38 percent since 2010. The increase in the maximum downtime cost ($2,409,991) was even greater, climbing 81 percent over that same time period.

The largest cost component was business disruption, followed by lost revenue and end-user productivity. IT productivity, detection, recovery, ex-post activities and equipment were next.

In light of that hierarchy of losses, the public sector had the lowest cost for unplanned outages ($476,000). Financial services, a heavily data-dependent industry, had the highest ($994,000).

The study, which was sponsored by Emerson Network Power, a provider of information and communications technology infrastructure, polled 63 data centers in the United States that had experienced outages in the past 12 months.

Cyber crimes are the fastest-growing cause of data center outages, rising from 2 percent in 2010 to 22 percent of outages in the latest study. Uninterruptible power supply (UPS) failure continues to be the No. 1 cause of unplanned data center outages, accounting for one-quarter of all such events.

IT equipment malfunction accounted for only 4 percent of all outages. Water, heat or air conditioning failure accounted for 11 percent of outages, followed by weather at 10 percent and generator failure at 6 percent.

Downtime costs rose faster than average in the more data-dependent industries. Unsurprisingly, the longer the outage, the greater the cost. Complete unplanned outages lasted on average 66 minutes longer than partial outages and were more than twice as expensive. Similarly, the larger the data center, the greater the cost of the downtime.

Human error was behind 22 percent of outages, the same as in 2013, indicating that no progress has been made to mitigate failures caused by workers, according to the report’s authors.

However, the true rate of human-caused downtime may be higher than Ponemon reported, as other industry watchers have argued that cyber crime and UPS system failures are ultimately caused by humans.

According to a 2014 report from IBM, over 95 percent of cyber crimes had human error as a contributing factor. Its 2015 report found that 55 percent of cyber threats were from people with insider access to an organization’s systems. And the biggest cause of UPS system failure is the operator, with lack of training, misinformation or budget constraints as possible underlying causes, according to an article by Quality Power Solutions, a UPS systems and generator provider.

The human factor was recently illustrated by the three hours of data center downtime that forced delays for JetBlue airlines. That failure was blamed on a power outage at a Verizon data center.

The root issue? Human error – Verizon’s disaster recovery plan either wasn’t in place or didn’t work.

"That really struck a chord with me," Kelly Quinn, a research manager at IDC, told SearchDataCenter. "My first thought was 'what was their disaster recovery plan and why didn't it work?'"

Quinn explained that human error is the No. 1 cause of outages in IDC's recent surveys on data center downtime.

About the Author

Kathleen Hickey is a freelance writer for GCN.

