- By Patricia Daukantas
- Jun 19, 2002
How State recovered from a server break-in
State's Victor E. Riche says, 'This has a happy ending. I'm still in the job I had last year.' He and his staff worked 10 weeks undoing a hacker attack.
(GCN Photo by Olivier Doulier)
When recovering from a Web site hack, agency managers should keep employees informed, prohibit IT staff from working around the clock and recognize their efforts once the job is done.
Those are some of the lessons from a May 2001 attack on one of the State Department's networks, said Victor E. Riche, managing director of the IT office within State's International Information Programs Office and Educational and Cultural Affairs Bureau. At the FedWeb 2002 conference last month in Bethesda, Md., he gave a rare insider's glimpse into his department's 10-week recovery.
'This has a happy ending,' Riche quipped at the start of his presentation. 'I'm still in the job I had last year.'
Riche manages 35 federal employees and 50 contract workers who handle a network, dubbed PDNet, with about 900 PCs and 40 servers and links to about 130 embassies around the world.
On May 8, 2001, Riche was at a luncheon when he learned of an intrusion on two of his Web servers.
Where the buck stops
Officials at State's bureaus of Diplomatic Security and IRM recommended taking down the network to which the servers were attached. Riche and other IRM officials went up the chain of command to get permission to do so, all the way to the secretary of state himself.
'Colin Powell made the decision to take down the network that I manage,' Riche said.
Riche described the first week as 'isolation and despair' because 'we wondered if we were ever going to be right again.'
As it turned out, the intruder left only some foul language on the servers. As far as State officials could determine, the visitor neither stole nor tampered with any files. But the recovery process was just beginning.
During the second week, officials drew up plans for extensive hardware and software upgrades.
The international information programs and cultural affairs offices had been part of the former U.S. Information Agency, which State absorbed in 1999, and they had been using a Novell NetWare network with desktop clients running Microsoft Windows 98. State uses Microsoft Windows NT exclusively, Riche said.
Over the Memorial Day weekend, three weeks after the incident, Riche and other IT workers removed 300 PCs from the international information programs office and copied their hard drives onto new PCs. They also split the network into old and new environments and got most of the e-mail accounts working again.
During the following week, employees kept reporting problems with their e-mail accounts.
'The funny thing is, within the new environment, the Internet was back, but people weren't that wild about it,' Riche said. 'They wanted their e-mail back.'
During Week 5 of the saga, Riche's staff was alerted in the middle of the night because the air conditioning had gone down in the server room.
Also, the new e-mail system clogged up and the IT staff couldn't understand why, Riche said. At one point more than 20,000 undelivered messages swamped the queue.
The following week, Riche and his staff 'took apart what the Microsoft experts built, and the e-mail started to go,' he said. The workers also patched the air-conditioning system with garden hoses.
Seven weeks into the recovery effort, techs moved to the cultural affairs office and copied and replaced the PCs there, Riche said.
The ninth week of the recovery included the Independence Day holiday. 'I finally got some of my staff to take three vacation days,' Riche said.
On July 12, 10 weeks after the attack, Riche declared the recovery job finished.
E-mail was working well.
During July and August, State officials conducted 'Operation Clean Sweep,' a Web survey about the site recovery that generated more than 350 responses. They took care of every complaint. In early September, Riche held an appreciation party for everyone who had worked on the recovery.
Riche said the incident reminded him of the importance of good communication and listening skills. 'You have to listen to what everybody is saying, not just your managers,' he said.
Keeping both offices' workers informed was extremely important, Riche said. His staff issued about 40 so-called PDNet Alerts over the 10 weeks of recovery, sometimes handing out paper copies to employees whose e-mail was down.
Food also was 'absolutely essential' to staff morale, Riche said. On Father's Day, the office manager came in to cook waffles for the fathers who had to work.
Sunny side up
'Managers need to stay optimistic or else the staff won't have a can-do attitude,' Riche said.
Riche said he didn't force anybody to sign up for night or weekend duty. When he started to notice how tired his workers were, he ruled that no one could work more than 12 hours per day without his approval. After that long, most people are not very effective, he said.