Alabama court rules against its tape backup
- By John Breeden II
- Jun 18, 2014
The U.S. District Court for the Southern District of Alabama is the first level of the federal court system, hearing cases originating in that part of the country that might involve anything from drug trafficking to civil suits.
Programmer analyst P.J. Isbell runs part of the IT shop for the court. For the past several years, one of his responsibilities every morning has been monitoring and changing tape backup drives for the court's eight servers. "It was quite a tedious process," he said.
Additionally, the court’s manual tape backup was not very efficient. If a user needed to get a lost file back, for example, its location would have to be looked up on the handwritten labels on the tapes. Then the restore process could take a couple of hours, and that's if everything worked. The tapes were also prone to errors and damage from humidity, heat, wear and tear and other hard-to-control environmental factors.
When the court wanted to upgrade this antiquated backup system, it began to look at disk-based alternatives, which have come down significantly in price in recent years and which offer better data integrity and faster backups than tape. The court eventually selected ExaGrid Systems to provide the new system, based in part on the unique way the company handles storage.
ExaGrid CEO Bill Andrews explained that every company in the backup storage market makes use of data deduplication technology, which compresses data by storing only those parts of a file that aren’t duplicated elsewhere.
Pointers are used to help systems reassemble the data when needed. Depending on the type of file being backed up, compression ratios of up to 20 to 1 are possible with deduplication. It's generally a quick process because small organizations such as the district court only change about 2 percent of their data every week, and only the changes are backed up.
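To make the idea of deduplication and pointers concrete, here is a minimal Python sketch. It is an illustration only and not any vendor's implementation: the fixed 4-byte chunk size, the use of SHA-256 digests as pointers, and the names `dedup_store` and `restore` are all assumptions chosen for the example.

```python
import hashlib

CHUNK_SIZE = 4  # unrealistically small, chosen so the example is easy to follow


def dedup_store(data, store):
    """Split data into chunks, keep only chunks not already in store,
    and return the list of pointers (digests) needed to rebuild the data."""
    pointers = []
    for i in range(0, len(data), CHUNK_SIZE):
        chunk = data[i:i + CHUNK_SIZE]
        digest = hashlib.sha256(chunk).hexdigest()
        store.setdefault(digest, chunk)  # stored once, however often it recurs
        pointers.append(digest)
    return pointers


def restore(pointers, store):
    """Reassemble the original data by following the pointers in order."""
    return b"".join(store[p] for p in pointers)


store = {}
monday = b"AAAABBBBCCCCDDDD"
tuesday = b"AAAABBBBXXXXDDDD"  # only one chunk changed since Monday

monday_ptrs = dedup_store(monday, store)
tuesday_ptrs = dedup_store(tuesday, store)

logical_bytes = len(monday) + len(tuesday)
stored_bytes = sum(len(c) for c in store.values())
print(f"{logical_bytes} bytes backed up, {stored_bytes} bytes actually stored")
```

In this toy run the second backup adds only the one chunk that changed, which is the effect that makes small weekly change rates cheap to back up.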
But even for smaller organizations, the technique can lead to longer backups and inefficient restore processes over time, something that ExaGrid appliances are designed to avoid, according to Andrews.
"Everyone else uses what is called in-line deduplication storage," he said. "They do the deduplication while the data is on the way to the disk. That means they need a very fast processor, and backups take longer to restore because everything has to be reassembled. They don’t store any non-deduplicated information on their disks, so it's unusable without taking it back through that process."
Instead, ExaGrid creates a landing zone on every appliance that isn't deduplicated. That means that if an organization has 10 terabytes of storage, the landing zone is also going to be that size.
Once the landing zone is in place, deduplication occurs within the ExaGrid system, and only data that has changed is deduplicated. Instead of taxing network resources every night to run lengthy in-line backup processes, all of that work is handled inside the appliance from the landing zone to the storage area.
The system can also run at any time, even during peak network crunches, as it’s a fully separate system. And because there are no backup storage time limits that need to be met, the ExaGrid appliance can run on slower processors, which are less expensive and which generate far less heat.
Restores are also quicker because they are made from the non-deduplicated landing zone when needed and perform like transferring a file from one drive to another. "When I was asked to pull a file from backup, it was dramatically faster and a lot easier than it ever was before," Isbell said. "I don’t miss tape at all."
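The difference between restoring from a landing zone and restoring from deduplicated storage can be sketched in a few lines of Python. This is a toy model built on assumptions, not ExaGrid's actual design: the class name `LandingZoneAppliance`, its methods, and the 4-byte chunk size are all hypothetical.

```python
import hashlib

CHUNK = 4  # illustrative chunk size only


class LandingZoneAppliance:
    """Toy model: the newest backup stays whole in a landing zone,
    and deduplication happens afterward, inside the appliance."""

    def __init__(self):
        self.landing_zone = {}  # file name -> full, non-deduplicated bytes
        self.chunks = {}        # digest -> unique chunk
        self.versions = {}      # file name -> list of pointer lists

    def ingest(self, name, data):
        # The backup lands as-is; no deduplication work on the way in.
        self.landing_zone[name] = data

    def deduplicate(self, name):
        # Runs later, off the backup path, and stores only unseen chunks.
        data = self.landing_zone[name]
        pointers = []
        for i in range(0, len(data), CHUNK):
            piece = data[i:i + CHUNK]
            digest = hashlib.sha256(piece).hexdigest()
            self.chunks.setdefault(digest, piece)
            pointers.append(digest)
        self.versions.setdefault(name, []).append(pointers)

    def restore_latest(self, name):
        # A plain copy, like moving a file from one drive to another.
        return self.landing_zone[name]

    def restore_version(self, name, index):
        # Older versions must be reassembled from chunks via pointers.
        return b"".join(self.chunks[d] for d in self.versions[name][index])


appliance = LandingZoneAppliance()
appliance.ingest("report.doc", b"AAAABBBBCCCC")
appliance.deduplicate("report.doc")
appliance.ingest("report.doc", b"AAAABBBBDDDD")
appliance.deduplicate("report.doc")

latest = appliance.restore_latest("report.doc")
previous = appliance.restore_version("report.doc", 0)
```

The trade-off in this model is disk space for speed: the landing zone holds a full copy, so the most common restore skips reassembly entirely.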
ExaGrid has several storage models that vary depending on the size of the backup needed. The EX1000 offers a terabyte of storage arrayed on five disks, including 4 GB of memory and a quad-core CPU to handle the deduplication process internally. It retails for $13,000, plus 15 percent per year in maintenance fees, including product and version upgrades and 24-hour phone and e-mail tech support. At the high end, the EX21000E provides up to 21 terabytes of storage and upgraded hardware. It retails for $59,000.
The District Court installed two EX1000 units. The first replaced the old tape drive machines. The second was placed off-site for disaster recovery, and updated data from the local EX1000 is periodically sent to it. To avoid replicating an entire landing zone over the network, the second unit was initially installed locally for the first full copy. Once the large landing zone was copied over, the unit was moved to the off-site location, where it receives only the changed data from the main appliance. That way it doesn't tax network resources while still keeping 100 percent of the data safe.
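The seed-locally-then-ship-changes approach can be illustrated with a short Python sketch. It assumes, purely for illustration, that both units hold chunks keyed by their SHA-256 digest; the helper names `put` and `replicate` and the 1,000-byte chunks are hypothetical and say nothing about how the appliances actually communicate.

```python
import hashlib


def put(store, chunk):
    """Add a chunk to a store, keyed by its digest."""
    store[hashlib.sha256(chunk).hexdigest()] = chunk


def replicate(primary, offsite):
    """Send only chunks the off-site unit lacks; return bytes transferred."""
    sent = 0
    for digest, chunk in primary.items():
        if digest not in offsite:
            offsite[digest] = chunk
            sent += len(chunk)
    return sent


primary, offsite = {}, {}
for chunk in (b"A" * 1000, b"B" * 1000, b"C" * 1000):
    put(primary, chunk)

# Initial seed: everything is copied, ideally while both units share a local network.
seed_bytes = replicate(primary, offsite)

# After the move off-site, a later backup adds one new chunk.
put(primary, b"D" * 1000)
nightly_bytes = replicate(primary, offsite)

print(f"seed: {seed_bytes} bytes, later update: {nightly_bytes} bytes")
```

Only the first transfer is large, which is why doing it over the local network before relocating the second unit keeps the wide-area link lightly loaded afterward.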
"We now perform a full backup weekly," Isbell said. "The incremental backups happen every day, though I don’t have to do nearly as much as before. I just look at one server and make sure its completed 100 percent of the backup. And I don’t have to remember to change the tapes."