Library of Congress preservation program works with millions of items, terabytes of data in a full spectrum of formats

 

With more than 100 million physical items already in its collections and new digital material coming in by the terabyte, the Library of Congress is developing standards, specifications and tools to help ensure that this material remains accessible to future generations.

The library now has three broad initiatives under NDIIP: working with universities and libraries to understand the nature of digital content, working with state consortiums to help in the preservation of state government records, and working with commercial content providers to develop standards for digital preservation.

We are warned to be careful about what we put online because data on the Internet lives forever. But keeping random copies of files on servers, routers and databases is not the same as preservation, said Martha Anderson, director of program management for the Library of Congress’ National Digital Information Infrastructure and Preservation Program. Digital data can be ephemeral. “That is the paradox,” she said.

Web sites can disappear in a matter of days or change repeatedly in a matter of hours. Files can become lost or corrupted, formats and hardware change, and physical media such as tapes and disks deteriorate far more quickly than anticipated.

So Congress charged the LOC in 2000 with preserving the nation’s digital heritage and at the same time making sure that its collection of 29 million books and 105 million other items gathered over the last 200 years remains available for the next 200 years. Toward that end, NDIIP has developed specifications and tools for the transfer of large digital files; worked with government, academia and industry on best practices for digitizing and preserving data; established programs to use delivery platforms such as Flickr to make LOC content available; and partnered with the private sector to harvest content from the Web for archiving.

The Library of Congress Office of Strategic Initiatives

Across the board: The LOC’s Office of Strategic Initiatives is preserving everything from books to Dictabelt recordings.


The challenges of wedding a physical past to a digital future are varied.

“The biggest difference is the element of time,” Anderson said. “Some physical artifacts can be put on a shelf and left for many years. Books from the 18th century are fine, for example. This is not so much the case with sound recordings and film.”

The technology changes and the media deteriorate. Finding playback equipment for a wax cylinder, an old movie or a Dictabelt can be difficult. And even when the equipment is available, the cylinder, film or belt itself might no longer be playable.

“The whole domain is looking to digital to carry this forward,” Anderson said.

But digital conversion is time-consuming, and each type of material requires its own technology and special handling. Although the library has been working since the 1990s on digitizing its collections and has made millions of files available online, Anderson estimates that only about 1 percent of the library’s holdings have been digitized.

And digital data can be tricky to handle. “Some formats are fairly stable,” Anderson said. Text and image files have not changed a lot in recent years, and there are plenty of PDF, TIFF and JPEG files that can be easily opened today. But sound and video formats tend to change more quickly. And the physical environment for storing and accessing files changes rapidly. “Servers and digital storage are a challenge. These turn over every three to five years and everything is moved off to another server.”

To accomplish its digital mission, the library takes advantage of work being done in industry and academia to establish standardized environments and tools rather than developing everything itself. Among the initiatives NDIIP is participating in are:

  • Development of the BagIt protocol for large data transfers.
  • A collaborative Web site for federal partners developing guidelines for digitizing records.
  • The National Digital Newspaper Program, in collaboration with the National Endowment for the Humanities, to digitize and preserve regional newspapers.
  • State-of-the-art facilities at the library’s Packard Campus for preserving the world’s largest collection of audiovisual works.
  • Partnering with universities and the Internet Archive to harvest and preserve more than 69 terabytes of content from the Web.
  • Supporting standards development for digital content, including Office Open XML, PDF/A and JPEG2000.
  • Development of open-source tools for receiving, archiving and accessing data in digital repositories.
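To make the first item in the list above concrete, here is a minimal sketch of the idea behind BagIt: files are packaged in a "bag" with a checksum manifest so the recipient can confirm that nothing was lost or corrupted in transfer. The sketch assumes the directory layout published as BagIt version 1.0 (a `data/` payload directory, a `bagit.txt` declaration and a `manifest-sha256.txt` file). The function names `make_bag` and `verify_bag` are hypothetical and are not taken from the library's own tools, and the article does not say which version or checksum algorithm the library used.

```python
import hashlib
import shutil
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Hash a file in chunks so large files need not fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def make_bag(source_dir: Path, bag_dir: Path) -> None:
    """Copy source_dir into bag_dir/data and write the BagIt tag files.

    Hypothetical helper: assumes bag_dir/data does not exist yet.
    """
    payload_dir = bag_dir / "data"
    shutil.copytree(source_dir, payload_dir)  # also creates bag_dir

    # Bag declaration (assumed here to follow the version 1.0 layout).
    (bag_dir / "bagit.txt").write_text(
        "BagIt-Version: 1.0\nTag-File-Character-Encoding: UTF-8\n",
        encoding="utf-8",
    )

    # One manifest line per payload file: "<checksum>  <path relative to bag>".
    lines = []
    for path in sorted(payload_dir.rglob("*")):
        if path.is_file():
            relative = path.relative_to(bag_dir).as_posix()
            lines.append(f"{sha256_of(path)}  {relative}\n")
    (bag_dir / "manifest-sha256.txt").write_text("".join(lines), encoding="utf-8")


def verify_bag(bag_dir: Path) -> list[str]:
    """Return payload paths that are missing or whose checksum has changed."""
    failures = []
    manifest = (bag_dir / "manifest-sha256.txt").read_text(encoding="utf-8")
    for line in manifest.splitlines():
        expected, relative = line.split(maxsplit=1)
        target = bag_dir / relative
        if not target.is_file() or sha256_of(target) != expected:
            failures.append(relative)
    return failures
```

The sender would call `make_bag` before a transfer and the recipient would call `verify_bag` afterward; an empty list means every payload file arrived intact. A production tool would also handle optional tag files and incomplete transfers, which this sketch leaves out.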

It is not the technology that poses the greatest challenge to digital preservation, Anderson said. “The biggest challenge is social, getting organizations to understand the value of digital materials.”

Most organizations focus on day-to-day operations without concern for preservation. “We would like to make preservation a part of regular operations and workflow,” she said. Part of the problem is the complexity of the tasks. “It is very complicated even to archive your own e-mail at home,” so preservation has not yet become a part of everyone’s digital environment.

Another challenge in establishing long-term programs for digital preservation is the speed of change in the digital environment. “Our job as we saw it in 2001 was much simpler than we see it today,” Anderson said.

When Congress gave the library the job of digital preservation, there was no Wikipedia, Google Maps, Flickr or Facebook. Today, those tools and others like them have changed the way digital content is created and distributed.

“We worked for nine months to gather video from the Internet,” Anderson said. “During that nine months, YouTube came onto the scene and changed everything.”
