Shaw AFB gets help in bringing lost data back from the grave
- By William Jackson
- May 29, 2009
The storage system at Shaw Air Force Base, S.C., had been working fine for years and had a tape-drive backup, so Dean Johnson wasn’t overly concerned when it started showing its age last year.
The 20th Civil Engineer Squadron at Shaw, which supports the base’s physical infrastructure, was using a seven-year-old Adaptec 2600 series SCSI Redundant Array of Independent Disks (RAID) controller with a 12-disk array to store data about the squadron’s training, base housing records and other critical activities.
“Adaptec started having its problems,” said Johnson, the squadron’s network administrator. “It started glitching a little bit once in a while.”
The system, an assembly of a dozen 73G drives, was still usable and the data was being backed up to a robotic tape-drive system from Qualstar, so the occasional glitches were not a big cause for worry.
“Then one day it went down and didn’t want to come back up,” Johnson said. “The controller could see the drives, but it couldn’t see the array.”
When he checked the backup, “I was horrified to find that the server it was supposed to be backing up to had gone off-line and no longer existed to the backup system.”
Stuck with a system that couldn’t function and no backup, Johnson called Adaptec, only to find that the company no longer supported the controller. Once the controller failed, he was told, “there is nothing you can do.”
Some of the information contained on the storage array, including certificates of completion for required training, was essential to the 15 organizations in the squadron.
“If you can’t produce the certificate, you have to do the training again,” Johnson said. “In the fire department alone that would have been at least 490 hours of training for a couple dozen people.” The system also stored records needed to account for the base’s housing activities and expenditures. “We had all sorts of data on that machine.”
Adaptec told him his best recourse was DriveSavers Data Recovery, a company in Novato, Calif., that, as its name implies, specializes in recovering digital data.
“We recover data off of any form of media,” including tapes, disks and solid-state drives, said Michael Hall, DriveSavers’ chief information security officer. “If you can write a 1 or a 0 to it, we can recover it.”
So Johnson shipped the controller and drive assembly to California and hoped for the best. He said he believed the storage system was a RAID 5 array, which would have been good news. “If it was 5, it would have been a quick restore,” he said, because RAID 5 includes redundancy that allows data to be rebuilt when a single drive is lost or missing.
A RAID can be configured in a number of ways that determine how data is written to it, the most common being RAID 0 (striped disks) and RAID 5 (striped disks with parity). RAID 0 distributes data across the array for improved speed and full capacity, but all data on all disks will be lost if any one disk fails. RAID 5 combines three or more disks to protect data against the loss of any one disk. It gives up one disk’s worth of capacity to parity information, and if a single disk fails, its contents can be rebuilt from the data and parity on the remaining disks.
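The difference between the two layouts comes down to parity, which can be illustrated with a short Python sketch. It is a simplified model of a single stripe, not a description of how the Adaptec controller or any other product works; the block contents and the helper name xor_blocks are made up for the example.

```python
from functools import reduce

def xor_blocks(blocks):
    """XOR equal-length byte blocks together, one byte position at a time."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))

# One stripe on a hypothetical four-disk RAID 5: three data blocks plus parity.
data_blocks = [b"AAAA", b"BBBB", b"CCCC"]
parity = xor_blocks(data_blocks)

# Simulate losing the disk that held the second data block.
survivors = [data_blocks[0], data_blocks[2], parity]

# XOR of the surviving data and parity blocks yields the missing block.
rebuilt = xor_blocks(survivors)
print(rebuilt == data_blocks[1])
```

A RAID 0 stripe holds only the data blocks. With no parity block among the survivors, there is nothing from which to compute the missing one, which is why losing a single disk takes the whole array with it.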
In California, Hall initially found that the news wasn’t good for the Shaw array. “I was looking at it and quickly determined it was not a RAID 5,” he said. It was a RAID 0, and one of its original drives had been replaced, jeopardizing the recovery of the system’s data.
That type of confusion is not unusual, Hall said. “Dean is a classic example. He thought he had a RAID 5,” but he didn’t.
It helps to get as much information as possible from the drive owner when trying to recover data, Hall said. “The more we know, the better it is.” But it is not necessary. The company has recovered data from drives that have been run over by cars and buses, burned in buildings and thrown out a third-story window, and once from a laptop PC that sank in the Amazon during a cruise, he said. The owner of the waterlogged laptop, a grad student, retrieved it with scuba gear and turned it over to DriveSavers to recover a graduate thesis on the drive.
Just about the only time it is impossible to recover data is after a catastrophic electro-mechanical failure — “in industry terms, a head crash,” Hall said. In that case, the medium is physically damaged, and the data has been scraped away.
Hall said that in the past 10 years, there has been a trend toward more electro-mechanical failures, although most of them are not catastrophic.
“When I first got here, about 20 percent of what we saw was electro-mechanical failure and 80 percent was logical,” he said. Today that ratio is reversed, partly because a growing number of tools are available to help owners recover from logical failures, so the professionals see fewer of them.
But the change has also happened because of the increasing precision and tighter tolerances of drives as their capacity grows into the terabyte scale. “They have learned how to cram as many bits as possible onto the surface,” Hall said. “They are really engineered tightly.”
But of all the problems the company sees, “the most complex type of data recovery is a multiple disk array,” Hall said. “You’re ganging a lot of drives together and asking them to act as one at the logical level. It can be quite complex.”
The first step in recovering the Shaw data was to find the original drive that had been replaced in the array. Often, drives are replaced because they’re damaged, so owners don’t hold onto them, Hall said. But Johnson had a spare drive in the office that he thought might be the missing one. “It was sheer luck that he still had it and it happened to be the correct drive,” Hall said.
DriveSavers reinstated the original drive and made a sector-by-sector image of the array. “At that point, I was able to rebuild the RAID 0 and get the data,” Hall said.
The data was recoverable because “we’re dealing with the device at the physical level, not the logical level,” he said. The technicians never work with the original drive but with a bit-by-bit target copy they examine using the operating system the data was recorded with.
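Why the original drive mattered can be seen in a rough Python sketch of reassembling a striped volume from per-disk images. This is an illustration under assumed conditions, not DriveSavers’ actual procedure: the stripe size, the number of disks and the function names stripe and destripe are all hypothetical, and the "images" are small in-memory byte strings rather than real disk copies.

```python
def stripe(data, n_disks, chunk):
    """Split data round-robin into per-disk images, RAID 0 style."""
    disks = [bytearray() for _ in range(n_disks)]
    for i in range(0, len(data), chunk):
        disks[(i // chunk) % n_disks] += data[i:i + chunk]
    return [bytes(d) for d in disks]

def destripe(images, chunk):
    """Rebuild the logical volume by reading one chunk from each image in turn."""
    out = bytearray()
    offset = 0
    while any(offset < len(img) for img in images):
        for img in images:
            out += img[offset:offset + chunk]
        offset += chunk
    return bytes(out)

# Stand-in for the array's contents; the sizes here are arbitrary.
original = bytes(range(256)) * 3
images = stripe(original, n_disks=4, chunk=16)

recovered = destripe(images, chunk=16)
print(recovered == original)

# Reading the same images in the wrong disk order scrambles the volume.
scrambled = destripe(images[::-1], chunk=16)
print(scrambled == original)
```

Every image has to be present, and in the right position, for the chunks to interleave correctly. A replacement drive holds none of the original chunks, so the array could not be reassembled until the original was back in place.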
From about 200G of data stored on the array, Johnson handpicked the files that were essential to his squadron.
“I took the pick of the litter,” he said. “They recovered roughly 45G of data that was absolutely critical” and sent it to him on a portable USB drive.
“It was a relatively painless experience,” Johnson said.