4 'smart' lessons from the Great Southwestern Blackout
The Great Southwestern Blackout of 2011 is full of lessons for critical infrastructure protection. As of this writing, we don’t have many details about what Arizona Public Service is calling an “employee-generated event,” but it demonstrates that the nation’s power grid is interconnected and fragmented, fragile and resilient, and that we don’t have to wait for our enemies to attack us.
The lights began going out about 3:30 p.m. Pacific Time on Sept. 8, when the North Gila–Hassayampa 500-kilovolt transmission line near Yuma, in the southwest corner of Arizona, tripped offline, according to APS. The resulting power outage cascaded into Southern California and northern Mexico, leaving more than 5 million people without power at its peak.
Power had been restored to most if not all customers by dawn Sept. 9, although a nuclear power plant that generates power for San Diego remained offline after an automatic shutdown.
“The outage appears to be related to a procedure an APS employee was carrying out in the North Gila substation,” APS said in a statement. “Operating and protection protocols typically would have isolated the resulting outage to the Yuma area. The reason that did not occur in this case will be the focal point of the investigation into the event, which already is under way.”
Lesson No. 1: The threat is not always from outside and usually is not malicious. In this case, it was an employee carrying out a procedure. We don’t know for sure what that means, but we can probably assume the procedure was not malicious and the employee should have known what he or she was doing.
Better training and oversight, and better policies and procedures, might have prevented this event. But all the defenses and policies we can muster will not be able to protect us from a determined insider looking for shortcuts or responding to unexpected situations.
The bottom line is, we have to assume these things are going to happen and be ready to respond to them. In this instance, APS and other area systems seem to have done a pretty good job of responding.
Lesson No. 2: Complexity is an enemy. Protocols should have isolated the outage, but for some reason it spread. The multiple links between hardware, software and power flows in the interconnected local, regional and national grid systems make it difficult to predict and prevent exactly what will happen in every possible event. Once again, we must expect the worst and be ready to respond.
Lesson No. 3: Complexity also is our ally. If a system is so complex that those who design and operate it are unable to predict exactly what will happen during an event, it would be even more difficult for an intruder to plan a widespread disruption.
We know that the power grid is vulnerable to outside cyberattacks. Malicious code reportedly has been found in some systems, apparently monitoring and waiting.
But the Southwestern blackout underscores the difficulty of engineering a disruption across multiple systems. Yes, it could be done, but the farther an incident spreads, the less predictable it becomes, making it possible that the results of a serious attack would be considerably limited. I would not recommend security through complexity as a strategy, but we should be aware that shutting down a complex, fragmented and resilient system can be more difficult than it seems at first glance.
Lesson No. 4: A Smart Grid would create both new weaknesses and strengths. There has been a lot of discussion about the need to secure an intelligent grid that would be networked to enable two-way flow of information and power. IP technology, Internet connections and millions of new user endpoints in homes and businesses around the country will greatly expand the attack surface of a system that already is vulnerable. With development and implementation of Smart Grid components now under way, improving security is essential.
At the same time, a better flow of information will increase situational awareness, helping to spot threats and attacks before they do real damage and to identify and isolate incidents — malicious or accidental — as they occur.
APS already is implementing some Smart Grid technology. It is installing automated network switches and remote monitoring equipment to help improve the responsiveness and reliability of its system, and its work in 2009 included installing 36,000 digital smart meters at customer homes and businesses throughout Flagstaff.
If a complete Smart Grid infrastructure were in place now, it might have been easier for the company to isolate the Sept. 8 outage. At the very least, it would have made it easier to find out after the fact what went wrong.