GPO seeks tools to harvest overlooked documents on the Web

The Government Printing Office is looking for IT products and services that will help it find, harvest and review documents from federal Web sites.

GPO wants to use Web crawler and data-mining technologies to retrieve publications from agency Web sites and identify those that have not been catalogued for its Federal Depository Library Program and its Cataloging and Indexing Program. The request for proposals appeared last week.

Federal agencies are increasingly publishing information only in electronic formats and frequently fail to inform GPO of new publications that should be included in the depository library and cataloging programs.

Web crawler and data-mining technologies will help GPO identify and collect such documents. GPO plans to launch a pilot with the Environmental Protection Agency to crawl the main EPA and sub-agency Web sites.

GPO expects the project to be completed six months from the award date and to be valued at up to $75,000 under a firm-fixed-price contract. Responses are due Jan. 31.

The Federal Depository Library Program provides government information to 1,250 libraries across the nation. The Cataloging and Indexing Program consists of bibliographic records of federal information published by the executive, judicial and legislative branches. GPO prepares machine-readable records for the Online Computer Library Center bibliographic network.

About the Author

Mary Mosquera is a reporter for Federal Computer Week.

