Science.gov 4.0 delves deep into the Web

The latest version of Science.gov, the search portal that trawls the Web for scientific information in 30 federal scientific databases and more than 1,800 Web sites, features a relevancy ranking architecture that can retrieve the full text of documents.

Launched today, Version 4.0 uses DeepRank, a relevancy ranking algorithm that returns more targeted results than the algorithms used in previous versions.

DeepRank uses information gathered from the full text of a document to perform relevancy ranking. Earlier versions of Science.gov relied on MetaRank, which ranked results based on metadata (bibliographic information such as title, author, date or abstract), and QuickRank, which relied on the document's title and short snippets of information.

DeepRank actually downloads and indexes documents, said Walter Warnick, director of the Energy Department's Office of Scientific and Technical Information. Commercial search engines such as Google crawl the Web by attempting "to visit each Web page they can find and make an index of that page. Science.gov does federated searching," searching pre-identified databases. "When the hits come back, they have to be sorted," Warnick said. "Otherwise patrons will be overwhelmed with hundreds of thousands of hits."
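The difference between ranking on titles and snippets and ranking on full text can be illustrated with a toy sketch. This is not DeepRank, MetaRank or QuickRank, whose internals are not described here; the `Hit` class, the `federated_search` function, the two sample databases and the term-counting score are all hypothetical, chosen only to show how merged hits from several sources might be sorted differently depending on which text is scored.

```python
from dataclasses import dataclass


@dataclass
class Hit:
    source: str     # hypothetical database name
    title: str
    snippet: str
    full_text: str  # assumed available only after the document is downloaded


def term_score(query: str, text: str) -> int:
    """Toy relevance score: count occurrences of the query terms in the text."""
    words = text.lower().split()
    return sum(words.count(term) for term in query.lower().split())


def federated_search(query, sources, use_full_text):
    """Send the query to every source, merge the hits, and sort them by score."""
    hits = [hit for search in sources for hit in search(query)]

    def score(hit):
        # Shallow ranking sees only title and snippet; deep ranking sees the document.
        text = hit.full_text if use_full_text else f"{hit.title} {hit.snippet}"
        return term_score(query, text)

    return sorted(hits, key=score, reverse=True)


# Two made-up sources standing in for pre-identified databases.
def db_a(query):
    return [Hit("db_a", "Solar cells", "A review of photovoltaics.",
                "solar solar solar energy conversion in solar cells")]


def db_b(query):
    return [Hit("db_b", "Solar energy and solar power", "Solar overview.",
                "a brief note on solar policy")]


shallow = federated_search("solar", [db_a, db_b], use_full_text=False)
deep = federated_search("solar", [db_a, db_b], use_full_text=True)
print([hit.source for hit in shallow])
print([hit.source for hit in deep])
```

In this made-up data, the hit from `db_b` mentions the query term more often in its title and snippet, while the document from `db_a` mentions it more often in its body, so the two modes order the same hits differently.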

All three relevancy ranking algorithms (DeepRank, MetaRank and QuickRank) were developed by Deep Web Technologies of Santa Fe, N.M.

Science.gov is free and requires no registration. The portal is hosted by the Energy Department's Office of Scientific and Technical Information. Members of the Science.gov Alliance include the Agriculture, Commerce, Defense, Education, Energy, Health and Human Services and Interior departments, and the Environmental Protection Agency, the Government Printing Office, NASA, and the National Science Foundation. Some support is also provided by the National Archives and Records Administration.

About the Author

Trudy Walsh is a senior writer for GCN.
