Tiered storage

 

Dividing storage resources into multiple layers offers both cost and performance advantages.

Storage growth is a major problem facing all types of IT managers. While storage costs per gigabyte are plummeting, the demand for capacity is rising even faster. As a result, agencies face two related but distinct issues: how to cut storage costs while continuing to provide users timely access to their data.

When it comes to access, the Geological Survey's Data Center in Middleton, Wis., is ahead of the game. The center uses 40GB of Solid State Disk devices from Texas Memory Systems Inc. of Houston to hold its most active databases in RAM.

"The solid-state disks hold the data that is high priority to give to customers fast, or it might be data files that are hot and get hit a lot," said data center director Harry House. "If you are I/O bound, SSD is a godsend. You can achieve some real performance breakthroughs with it."

Breaking ground

But just as important as cutting costs is the long-term preservation of electronic data. The National Archives and Records Administration is leading the way in this area. Last year, NARA awarded a $308 million contract to a team headed by Lockheed Martin Corp. to begin establishing the Electronic Records Archive system.

"NARA's in the business of archiving information for the life of the republic, and the electronic records will continue to grow," said Clyde Relick, Lockheed Martin's program director for the ERA contract. "Essentially we are building a system that has to be able to incorporate new technology and be scalable for having unlimited amounts of storage."

Whether one is concerned with providing enough disks right now or looking at an archive to last for millennia, proper system design is essential.

Storage vs. archiving

The term "archiving" is used for two distinctly different purposes. One is as part of a standard backup or disaster recovery program, where data is put on tape for offsite storage. The other purpose is to make the data available for long-term access. In this case, the data can be stored on either disk or tape.

An approach growing in popularity of late, multitiered storage is primarily a means of balancing costs and availability. According to San Francisco-based consulting firm the 451 Group, primary SCSI disk storage costs about $2 to $6 per gigabyte, secondary ATA drives about 50 cents per gigabyte and tape only 12 cents per gigabyte.

From an availability viewpoint, everything would be on disk. From a cost viewpoint, everything should be on tape, but tape doesn't meet the need for availability.

"Tape in general is increasingly being exposed as a substandard medium for backups," said Simon Robinson, storage research director at the 451 Group. "Users like it because it's cheap; but apart from that, it's inherently unreliable and delivers poor performance."

Backup, Continuous Data Protection, mirroring, replication, snapshots and other technologies typically use some form of tiered storage. This approach lets you prioritize stored data and use different storage media for each type: say, mission-critical Tier 1 data on optical disks, less commonly used data on ATA, and rarely used or backup data on tape. Tape is the old standard for backups, but disk technologies are gaining popularity as the prices for the two media converge.

"It's exceptionally difficult to restore from tape (especially if the data you want is very old or is stored off-site), which if you think about it is the whole point of backup," said Robinson.

"Tape still has its place as a longer-term archive format, but it is being superseded at a rapid rate by disk, especially in larger enterprises," he added.

The trick is to find the optimum balance between tape and disk. This is where Information Lifecycle Management comes in.

"Archiving is a mechanism to expand your tiering strategies and provide the right-cost component to business needs," said Robert Stevenson, managing director of storage for The Info Pro in New York. "Archiving is looking at how data changes over time. You can control consumption of Tier One and Tier Two high-cost storage by moving data to a tertiary tier."

But in doing so, it is not necessary to archive everything.

"There is a difference between archiving and hoarding," said Dorian Cougias, CEO of Network Frontiers LLC of Oakland, Calif., as well as the co-author of The Backup Book: Disaster Recovery from Desktop to Data Center (Schaser-Vartan Books). "Archiving is done to fulfill a compliance obligation, and 95 percent of the data you are storing on the network falls completely outside that scope."

What's ILM?

Information Lifecycle Management, or ILM, is a strategy for automatically moving data from one storage tier to another in order to cut the cost of storing less frequently accessed material.

For instance, notice how a bank handles a customer's deposit, Cougias said. For the first few weeks, the teller can produce a copy of the transaction. Then it goes onto a system the bank manager can access. After six months, the data is archived, and the customer has to put in a request and wait to receive a copy. Eventually the data is erased or destroyed.

Robert Eckstein, assistant network manager for the Ninth Circuit Court of Appeals, manages 400GB of single-tier storage, which is backed up on tape. That system is adequate for now but may not meet future needs. "We are looking at ILM," he said.

"We expect that our storage will increase significantly when our court is on the new Case Management/Electronic Case Files system. At that point we will need a much more in-depth storage system."

The simplest way of implementing ILM is to store all the data initially on Tier One storage and then migrate the material to other, cheaper tiers over time. But this approach doesn't necessarily meet all business needs, so vendors have suggested more complex sets of rules, based on the types of documents being held or how often they have been accessed. This approach, too, has its limitations.

"ILM has ended up being used by the industry to create a notion that data will be created on a certain class of storage, then, based on policies, age or something else, will dynamically migrate to lower-cost storage," said Manish Goel, vice president and general manager of data protection and retention for Network Appliance Inc. of Sunnyvale, Calif. "That is such an administratively complex architectural solution and has never really taken off."

That doesn't mean, however, that the basic concept of moving data through different levels of storage is incorrect, but that a mature approach is needed to define a strategy appropriate to one's own needs.

Databases, for example, need to be handled differently than documents. The Transportation Department's Research and Innovative Technology Administration (RITA) has 15TB of storage for data warehousing of transportation-related databases that are used for analysis and reporting, data collection and processing systems, and Web sites to make that data publicly available.

"At this time, all of our data are maintained on a single tier; we have not archived anything," said Terry M. Klein, director of the Office of Information Technology and deputy CIO. "We do, of course, back up all our data to tape."

The USGS data center also keeps its databases active, but migrates them to different types of storage based on the level of requests for data. Those that receive more requests, or require faster response times, stay on the Tier One SSD devices.

Others with a lower number of I/O requests stay on Tier Two disks. Middle-tier storage consists of several terabytes of databases on Network Appliance storage appliances. These are then backed up to about 10TB of disk storage devices from Excel Meridien Data Inc. of Carrollton, Texas. Finally, the data is archived to tape and moved off-site.

The University of New Mexico's Health Sciences Center, however, does use the traditional ILM concept of migrating documents with age. The university has 10TB of storage for general-purpose information, which largely consists of about 10 million files users have uploaded to the central storage to back up their hard drives.

About two-thirds of that is primary storage. The center uses hierarchical storage management software from CaminoSoft Corp. of Westlake Village, Calif. The software creates a stub file in the primary tier that points to the document's location in the secondary tier.

"We do it on a simple rule," said IT systems manager Barney D. Metzner. "If the file creation date and last access date goes longer than, on average, six months, we migrate it, though there are a number of exceptions for files such as databases and PowerPoints."

Such systems are complicated, and Metzner said some of his technicians who have to deal with the complexity view ILM as a negative.

"I still weigh it as a positive," said Metzner. "It continues to keep us in business as our storage needs grow."

Long-term view

Beyond moving data to cheaper disks or archiving to tape to cut storage costs, archiving for long-term access has its own set of challenges. To begin with, the data must remain accessible despite changing technologies and the deterioration of storage media.

"Government agencies face the same problems everyone does: maintaining secure and cost-effective long-term readability, physically and logically," said Michael Peterson, the program director of the Storage Networking Industry Association's Data Management Forum. "Media has to be migrated every three to five years to assure physical readability, and application data formats have to be maintained throughout revision changes, application changes, and reader changes."

There is also the matter of finding the data once it has been archived. It is not the same as restoring a file from a backup tape.

"Backup is for mass restoration," said Cougias. "Archiving is 'Give me the needle in the haystack, and I want it in a readable format.'"

Before issuing an RFP, therefore, he recommends doing a thorough analysis of what one's compliance requirements are, defining what a record is and then defining how those records will be used. Typically, only five percent of the data actually needs to be archived.

An agency can't just take a cut-and-paste approach to writing an RFP for its own archive, said Grant Stephen, CEO of Tessella Inc. of Newton, Mass. Tessella oversaw the national archives projects in the U.K. and the Netherlands and is part of the ERA project team.

Each organization should have its own policies and procedures for data creation, security and storage. Unless these are examined ahead of time and an actual understanding reached of what needs to be archived and how it will be accessed, the RFP winds up being self-contradictory. (Stephen said he has seen a few of those.)

"The organization has spent millions or billions or hundreds of billions of dollars building the system," he said. "The number one thing is to stop thinking about the problem and start dealing with it."
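
The age-based rule Metzner describes for the University of New Mexico (migrate a file once both its creation date and its last access date are more than about six months old, with exceptions for certain file types) can be sketched roughly as follows. This is only an illustration, not CaminoSoft's implementation: the 182-day threshold, the exempt file suffixes, and the names `FileRecord`, `should_migrate` and `migrate` are all assumptions made for the example.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta

# Assumed threshold: roughly six months, per the rule quoted in the article.
MIGRATION_AGE = timedelta(days=182)

# Hypothetical exemptions; the article mentions databases and PowerPoint files.
EXEMPT_SUFFIXES = (".mdb", ".ppt")


@dataclass
class FileRecord:
    name: str
    created: datetime
    last_accessed: datetime
    tier: str = "primary"


def should_migrate(record: FileRecord, now: datetime) -> bool:
    """Return True if a primary-tier file is old and idle enough to move."""
    if record.tier != "primary":
        return False
    if record.name.lower().endswith(EXEMPT_SUFFIXES):
        return False
    # Both dates must be past the threshold, so test the more recent one.
    newest_activity = max(record.created, record.last_accessed)
    return now - newest_activity > MIGRATION_AGE


def migrate(records: list[FileRecord], now: datetime) -> list[str]:
    """Mark eligible files as secondary-tier and return their names.

    A real HSM product would also leave a stub in the primary tier that
    points at the new location; here the tier field stands in for that.
    """
    moved = []
    for record in records:
        if should_migrate(record, now):
            record.tier = "secondary"
            moved.append(record.name)
    return moved
```

Even in this toy form, the exemption list hints at why such policies become administratively complex: every exception is another rule someone has to maintain.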

RFP Checklist

The National Archives and Records Administration is doing a lot of work on the long-term archiving of data and has briefings, standards and training available for other agencies. Consult the Electronic Records Archive site (www.archives.gov/era) to ensure that your own proposal will be compatible with ERA. You can also view NARA's own RFP on the site as a guide to preparing your own, but it won't match your exact needs. Some factors to consider in preparing your own RFP include:

  • How much primary storage do you need? How much secondary storage do you need? How much tertiary storage? Can you cut down that quantity by deleting data or removing duplicate items?

  • What data formats need to be stored?

  • How does the data get moved from one storage tier to another? Is it an automatic or manual process?

  • Do certain types of data reside on different tiers? Or does the data migrate from one tier to another based on age, frequency of access or some other policy?

  • Will the data be archived on disk so it is readily available, or will it be sent to tape?

  • What will the archived data be used for? How will users access it, and through what type of application or interface?

  • How will that data be indexed and searched?

  • What policies will be used in determining what data gets archived and what gets deleted?

  • How long does the data need to be kept available? Is it the same amount of time for everything, or do different types of data have different life spans?

  • What type of data classification tools will be used?

  • How do you ensure data security requirements are met as the data migrates from one storage layer to another? Do you need to maintain separate physical systems for the classified data, or can you go with a software security system? What about maintaining the confidentiality of medical or personnel records, where access depends not on a security level but on a need to know?

  • If the data is being stored on tape or optical disk, how long will the data on that medium be readable? What mechanisms need to be in place to copy the data onto new storage media, and on what time schedule? Who is responsible for doing this?

  • How will you deal with changes in data formats, applications and hardware over the years so that data is still accessible?

  • How will your solution integrate with or meet the standards of NARA's Electronic Records Archive system?

  • What other standards will be used, such as XML Archive or PDF/Archive?

  • Does the system comply with DOD Standard 5015.2, ISO 15489 and the Federal Enterprise Architecture Records Management Profile?

  • Is your agency going to act as the project lead in coordinating the hardware and software vendors, or will you hire a single primary contractor?
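
When sizing the tiers asked about in the checklist above, a rough media-cost comparison can help. The sketch below uses the per-gigabyte prices the article attributes to the 451 Group; the capacity split in the example is hypothetical, the function name `tier_cost` is an assumption, and the figures cover media only, not administration, power or software.

```python
# Per-gigabyte prices quoted in the article (451 Group). Primary SCSI is
# given as a range, so low and high bounds are kept separately.
COST_PER_GB = {
    "primary_scsi": (2.00, 6.00),
    "secondary_ata": (0.50, 0.50),
    "tape": (0.12, 0.12),
}


def tier_cost(allocation_gb: dict[str, float]) -> tuple[float, float]:
    """Return (low, high) media cost in dollars for a per-tier allocation."""
    low = sum(gb * COST_PER_GB[tier][0] for tier, gb in allocation_gb.items())
    high = sum(gb * COST_PER_GB[tier][1] for tier, gb in allocation_gb.items())
    return low, high


# Hypothetical example: 10,000GB kept entirely on primary disk versus the
# same capacity split across three tiers.
single_tier = tier_cost({"primary_scsi": 10_000})
tiered = tier_cost({"primary_scsi": 2_000, "secondary_ata": 3_000, "tape": 5_000})
```

Under these assumptions the single-tier layout costs about $20,000 to $60,000 in media, while the three-tier split costs about $6,100 to $14,100. How much data can actually tolerate the slower tiers is the question the rest of the checklist is meant to answer.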

Alternatives: The IBM DS8000 storage system (top and left) can hold up to 96 petabytes of data. IBM System Storage DS4000 Series (below) is aimed at the storage needs of small and midsize organizations.

TWO APPROACHES: Network Appliance's NearStore R200 (left) is a disk-based secondary storage device. Right: a five-bay EMC Symmetrix DMX-3.
