NARA's Web archive plan irks agencies

As administration changed hands, agencies began Web site face-lifts


Agencies are learning that taking a snapshot of a Web site isn't as simple as clicking a camera shutter.

Just eight days before President Bush took office, the National Archives and Records Administration set out exacting criteria for archiving all government Web sites created during the Clinton era.

In essence, NARA asked all agencies to provide snapshots of their sites' content at the time of the presidential transition.

Agencies complained about the amount of work demanded on short notice. So on Jan. 26, NARA announced greater flexibility in the types of filenames, storage media formats and documentation it will accept.

'We wish we could have announced it earlier, but we couldn't,' Nancy Allard, a member of NARA's policy and communications staff, said of the original request. There were 'too many balls in the air,' she said.

Regardless of whether agencies changed the content or appearance of their sites along with the chief executive, NARA wants to know what was on the sites as close to Jan. 20 as possible.

On Jan. 12, deputy archivist Lewis J. Bellardo asked agency chief information officers to take snapshots of their sites and forward them to NARA within 60 days of the start of the Bush administration.

Allard acknowledged that it would have been better to give agencies much more notice, but NARA had to act quickly once officials realized that delay might result in a loss of records.

Bellardo's memorandum instructed agencies to copy all Web documents available to the public up to Jan. 20, save them on certain kinds of storage tape or CD-ROM, and ship them to NARA's facilities in College Park, Md.

The original instructions required nine-track tapes or 18-track, 3480-class tape cartridges. CD-ROMs had to conform to the International Organization for Standardization's 9660-1990 specification, which limits filenames to eight characters plus a three-character extension.
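To illustrate what the filename limit meant for a webmaster, the following is a minimal Python sketch that flags names exceeding an eight-character base plus a three-character extension. It is not NARA's procedure. The function names, the sample filenames and the allowed character set (letters, digits and underscore) are assumptions made for illustration, not details from the memo.

```python
import re

# Assumption: "8.3" means a base name of 1-8 characters plus an optional
# extension of 1-3 characters. The allowed character set (letters, digits,
# underscore) is an illustrative assumption, not taken from the memo.
EIGHT_DOT_THREE = re.compile(r"[A-Za-z0-9_]{1,8}(\.[A-Za-z0-9_]{1,3})?")


def fits_8_3(filename: str) -> bool:
    """Return True if a bare filename (no directory part) fits the 8.3 pattern."""
    return EIGHT_DOT_THREE.fullmatch(filename) is not None


def find_long_names(filenames):
    """Return the filenames that would need renaming under an 8.3 limit."""
    return [name for name in filenames if not fits_8_3(name)]


if __name__ == "__main__":
    # Hypothetical filenames, chosen only to show both outcomes.
    sample = ["index.htm", "index.html", "press_releases.htm", "logo.gif"]
    print(find_long_names(sample))
```

In this sketch a common name such as `index.html` fails the check because its extension has four characters, which suggests why renaming files, and the links pointing to them, could add up to substantial labor across a large site.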

The media requirement followed NARA's usual procedures for making backup copies of electronic records as soon as they arrive, Allard said. But when the Jan. 12 memo came out, many agencies complained.

Uncompressed files

The second set of guidelines let agencies submit snapshots on digital linear tape cartridge. NARA also permits CD-ROM storage that complies with the so-called Joliet modifications to ISO 9660-1990 for handling long filenames.

'What we were trying to do is remove barriers,' Allard said of the expanded guidelines. Whatever storage media agencies choose, NARA wants files in an uncompressed format so that staff members don't have to deal with unzipping them, Allard said.

'The snapshots were not hard to do,' said Ruth M. Doerflein, Internet technical manager and central webmaster for the Health and Human Services Department.

On Jan. 16, Doerflein instructed HHS bureau webmasters to take their snapshots as close to the Jan. 20 swearing-in as possible and save them either on special backup tape or in a locked-down area of a server. The snapshots haven't been touched since, while the agency's webmasters seek additional guidance regarding NARA's rules.

Doerflein said her staff had 'major problems with a monumental amount of labor hours and budgetary expense' that would have forced HHS to seek an extension of the 60-day deadline.

NARA's own Web site has filenames longer than eight characters with three-character extensions, Doerflein said. NARA 'would have had a problem meeting their technical specifications,' she said.

On Jan. 24, NARA officials held a teleconference with records officers, information technology staff and webmasters from HHS and other agencies. Doerflein said she faxed questions in advance.

The records agency initially wanted webmasters to terminate all external links or insert pages redirecting viewers to originating pages, Doerflein said. NARA removed that mandate on Jan. 26.

The original requirement for accompanying documentation also confused many webmasters. Some thought NARA was asking for one form per page within each site, a task they deemed unworkable.

Allard said NARA always intended to require one completed form for an entire Web site, and that was clear in responses to questions. The records agency also released a simplified two-part form to accompany snapshots.

Some agencies interpreted the original requirements to mean that all files should be saved in ASCII, but that is impossible for image files, which should be maintained in their original format, Allard said. The revised documentation form asked webmasters to check off the file formats used on each site.

Few submitted so far

HHS webmasters are still sitting on their snapshots until they work out a few remaining issues with NARA, Doerflein said. She estimated that 90 percent of the department's snapshots are ready for transfer, however.

Agencies are making good-faith efforts to follow through with the snapshots, Allard said. Most have either compiled the snapshot files or have made separate backups from which to compile the snapshot and now are working on the documentation. But as of last week, NARA's Electronic and Special Media Records Services Division had received submissions from only a couple of small agencies, she said.

NARA officials expect that most agencies will meet the March 20 deadline for turning over the snapshots. Agencies that think they might miss the deadline should contact NARA as soon as possible, Allard said.

'You have paper people trying to regulate electronic media based on paper specs,' Doerflein said of the confusion surrounding the snapshot mandate.

NARA officials hope to clear things up by holding a forum for records officers and technical managers on Feb. 9 at the Archives II building in College Park.

In addition, NARA is seeking webmasters and records officers to join a new focus group on long-term management of federal Web records.

Of all the federal Web sites that changed to reflect the new administration, the White House and State Department sites underwent the greatest transformations.

Right after George W. Bush's inaugural ceremony on Jan. 20, a new White House site replaced the Clinton-era pages. Simultaneously, State revamped its site around a database-driven structure.

The Clinton Presidential Materials Project's site depicts not only the White House site at the close of the 42nd president's term, but also the site's evolution since it came into existence in July 1994.

Old Clinton sites are searchable from the project site. For example, a search for Socks, the Clintons' former cat, yields 36 results on the 1995-96 version of the White House site.

Colleen Hope, director of the Office of Electronic Information in State's Bureau of Public Affairs, said she had long planned to change the department's site, setting the Clinton-era pages apart from the State site under his successor.

The federal depository library at the University of Illinois has worked with State to archive Web pages developed by the Arms Control and Disarmament Agency and the U.S. Information Agency, both of which have become State agencies.

State is now switching to a commercial Web hosting service, WorldCom Inc.'s UUNet subsidiary, which can provide Oracle8i database support and server redundancy, Hope said.

Another contractor, United Information Systems Inc. of Bethesda, Md., designed the new site, Hope said. State's old pages are available through the Archive link on its new home page.

Ruth M. Doerflein, Internet technical manager and webmaster for the Health and Human Services Department's home page, said she was waiting until after HHS Secretary Tommy Thompson's swearing-in to learn whether her department's site will undergo a significant overhaul.

Doerflein said she isn't stressed out, because 'I know how long it took me to get the current home page out the door once the politics got involved.'

The Agriculture Department's home page, which reflects the arrival of new Secretary Ann Veneman, is among the sites that have changed.

But USDA webmaster Vic Powell said most of the department's ongoing Web upgrades are unrelated to the change at the White House.

'It was time to update and freshen the site,' Powell said.

Patricia Daukantas

