Power search

Could search technology replace some information management standards?

The federal government is beginning to see what the private sector has already discovered: Search technology could be the answer to all its information management problems.

A recent request for information, issued jointly by the General Services Administration and the Office of Management and Budget, asks whether search technology is powerful enough to replace some government standards for information management.

"Does current search technology perform to a sufficiently high level to make an added investment in metadata tagging unnecessary in terms of cost and benefit?" the Sept. 15 RFI asks. Responses are due by Oct. 21.

The notice will likely lead to and shape procurements in the next decade, according to supplementary information on the Federal Business Opportunities Web site. Some people say existing technologies that can fulfill the request are ready and waiting for the government to notice them.

Suggested approaches must meet the wide-reaching aim of identifying the most cost-effective means to search for, locate, retrieve and share information. The notice lists seven scenarios to provide context.

For example, the government is looking for information on how to help a physician search multiple databases and Web sites for treatments for a defense contractor's unexplained illness. The doctor might not know which agencies provide information on unexplained or service-related illnesses. He or she would also need a way to search nongovernmental sources, and some of the information might not be easily accessible through traditional Internet search engines.

In addition to tackling information sharing, vendors' suggested approaches must address the problem of access.

The RFI appears at a time when popular commercial search engines -- such as Google, Yahoo and Microsoft's MSN Search -- are about to retire a 10-year-old government search standard intended as an electronic card catalog of public government information.

The National Institute of Standards and Technology wants to withdraw the Government Information Locator Service because the agency considers the search standard obsolete. A July 15 Federal Register notice states that recalling the standard, also known as the International Organization for Standardization's ISO 23950, seems justified because most agencies now use commercial search tools to help people locate government information.

Accordingly, the RFI seeks approaches that could avoid the use of government-mandated standards.

Alternatively, the notice asks vendors to explain why they believe government standards are not necessary or cost-effective.

Some government computer programmers say they are impressed by GSA's and OMB's foresight in issuing the notice.

Tamas Doszkocs, a computer scientist at the National Library of Medicine, has been working on the metasearch and clustering engine ToxSeek for almost a decade.

The RFI "is a very good way of taking a look at an extremely complex array of problems and solutions and trying to elicit feedback from major contractors who would be able to address this whole complex issue," he said. "It indicates a keen awareness of the complexity of the problems."

Doszkocs said only piecemeal solutions now exist in industry and government.

"There is nobody that could address and provide solutions to all of the concerns and problem sets," he said. "But there are certainly companies that have formidable technologies that could team up."

As OMB moves forward in soliciting help from industry, the agency is also seeking guidance from federal stakeholders, government officials say.

Last year, for instance, the Interagency Committee on Government Information submitted draft recommendations to OMB on adopting open, interoperable standards. Those standards would help agencies catalog information so that people can search any government system using terms that allow information to be identified electronically. Section 207 of the E-Government Act of 2002 required agencies to develop those recommendations.

The committee's report calls for the federal government to implement a searchable identifier standard that would provide long-term access to digital information. The paper states that the standard should be flexible enough to remain viable as technology changes and specific enough to provide authoritative access to government information.

OMB officials say they are considering the committee's ideas as they develop policies to foster better public access to government information. They will issue the policies to agencies by Dec. 17.

Karen Evans, OMB's administrator of e-government and information technology, said the RFI language asking whether search technology should replace government standards does not conflict with the E-Government Act.

"The question the RFI asks in no way suggests avoiding the use of standards when such are necessary," she said. "Moreover, it most certainly does not suggest noninteroperable searching. Rather, it seeks to identify where metadata tagging or other formal -- and costly -- advanced information preparation mechanisms achieve the goal of making information more easily accessible to interested parties."

In the three years since the E-Government Act was enacted, she added, improvements in commercial search technologies have altered the Bush administration's attitude toward business information retrieval solutions. The RFI seeks to ensure that the public benefits from commercial advances when it seeks government information, Evans said.

The Government Printing Office is also involved in the information retrieval and sharing initiative.

GPO, the agency responsible for distributing government publications, has assigned several employees to OMB during the past year. One of those employees will soon return to GPO for work on a new digital distribution system capable of verifying and tracking all versions of official government documents.

GPO officials say the system's design will ensure authenticity of government information and permanent public access to that information.

"The RFI will help our efforts since we are working closely with the community that generated the RFI and [that] is developing enhanced search tools," GPO spokeswoman Veronica Meter said.

By July 2007, GPO officials expect to have an operational system that will support Web browsing, downloading and printing. It will also have search tools and redundant data warehouses.

Vendors say intelligence agencies have already succeeded with endeavors similar to what OMB and GSA are looking for.

"The tools and products are already available to support this initiative," said Paul Norcini, federal channels manager at search tools supplier Verity.

Verity's solutions can index data formats from disparate repositories into searchable collections. Other tools then categorize the data based on concepts, metadata and highlighted information.

Indexing facilitates information sharing, while highlighting helps with retrieval, Norcini said.

Agency workers can also simultaneously search government and nongovernment systems with existing technology.

Norcini said OMB and GSA need to consider how their programs will detect patterns and connections among pieces of information.

"There is more to information sharing than just search," he said.

One global consortium is working with foreign governments on a massive information retrieval and sharing project that could influence the U.S. government's path.

Earlier this month, groups from industry, government, academia and nonprofit organizations announced plans to provide online versions of books, academic papers, video and audio to the world. The Internet Archive, a nonprofit entity that offers access to historical collections in digital format, will host the Open Content Alliance (OCA). The National Archives of the United Kingdom has already contributed to the effort.

The OCA "may significantly help the [U.S.] government in doing their public access mission," Internet Archive co-founder Brewster Kahle said. "The OCA is an almost unprecedented collaboration between nonprofits, libraries, government institutions and commercial search engines to bring to life the treasures that are currently locked up in independent collections."

Kahle said he has been talking to GPO officials for the past year about joining the alliance. The alliance will unveil a technology Oct. 25 that performs nondestructive scans of book pages at high resolutions for 10 cents a page. Those cost savings could appeal to GPO and its Federal Depository Library Program, he said.

Anyone will be able to search and download works from the alliance's repository for free. Yahoo will provide the search engine, but all content will be available for other major search engines to index.

"The combination of large digital archives and the Internet could allow us to take all the U.S. government information and make it available through technologies such as commercial search engines," Kahle said. "We hope that the government considers the OCA as a way of achieving its aims."

