Vivisimo clusters government Web sites
- By Joab Jackson
- Feb 12, 2004
In an effort to demonstrate how its product can augment Web site searches, Vivisimo Inc. has overlaid its clustering software onto 30 government Web site search engines, including the governmentwide FirstGov portal.
Vivisimo's software clusters document pointers that are gathered by a search engine. The software intercepts search results sent back from search engines, either an Internet search engine or one cataloguing a specific organization's Web site, and quickly categorizes the results, placing them in folders for the user to peruse.
The software simplifies search results, said Raul Valdes-Perez, president of the Pittsburgh-based company. A search engine may return thousands of results for a single query, most of which the viewer never looks at.
'Regardless of how diligent you are, you are always going to overlook information,' Valdes-Perez said. 'On a practical level, we help the person more intelligently choose which information to overlook.' By organizing the information in hierarchical sets of folders, the software gives users a wider overview of the types of information available.
The company has posted on its own Web site a demonstration that lets users submit queries to federal, state, local and academic sites. The software submits the queries to the agencies' own search engines. When the results are returned, the Vivisimo software removes duplicates and organizes them into folders, divided up by words that frequently appear in the documents.
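The post-processing described above, removing duplicates and then sorting results into folders named for frequently occurring words, can be illustrated with a short Python sketch. This is not Vivisimo's algorithm, which the article does not detail. It is a deliberately naive stand-in that assumes each result is a dictionary with hypothetical `title` and `url` fields, treats two results as duplicates when their URLs match, and labels each folder with the most common non-query word in a result's title.

```python
import re
from collections import Counter, defaultdict

# Assumption: a tiny hand-picked stopword list, for illustration only.
STOPWORDS = {"a", "an", "and", "the", "of", "for", "in", "on", "to", "by", "at", "is"}


def dedupe(results):
    """Keep the first result for each URL (case and trailing slash ignored)."""
    seen = set()
    unique = []
    for result in results:
        key = result["url"].rstrip("/").lower()
        if key not in seen:
            seen.add(key)
            unique.append(result)
    return unique


def words(text):
    """Return the set of lowercase alphabetic words in text, minus stopwords."""
    return {w for w in re.findall(r"[a-z]+", text.lower()) if w not in STOPWORDS}


def cluster(results, query, min_size=2):
    """Group deduplicated results into folders keyed by a frequent title word.

    Words from the query itself are ignored, since they appear in nearly
    every hit and would not separate the results.
    """
    unique = dedupe(results)
    query_words = words(query)

    # Count how many results each candidate word appears in.
    counts = Counter()
    for result in unique:
        counts.update(words(result["title"]) - query_words)

    folders = defaultdict(list)
    for result in unique:
        candidates = [
            w for w in words(result["title"]) - query_words
            if counts[w] >= min_size
        ]
        # Pick the most widely shared word; break ties alphabetically.
        label = max(candidates, key=lambda w: (counts[w], w)) if candidates else "other"
        folders[label].append(result["title"])
    return dict(folders)
```

Calling `cluster(results, "enterprise architecture")` on a handful of hits would return a dictionary mapping folder labels to lists of titles. A production system would presumably cluster on snippets or full text, build multiword labels and nest folders hierarchically, none of which this sketch attempts.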
For instance, a search for documents with the term 'enterprise architecture' on FirstGov itself returns over 1,000 hits. A FirstGov query filtered through Vivisimo's site results in 33 top-level folders, divided into keyword-like categories such as 'Federal Enterprise Architecture' and by individual agencies, such as the Agriculture and Housing and Urban Development departments.
Valdes-Perez and two graduate students first developed the clustering algorithms at Carnegie Mellon University. The company, formed in 2000, further developed the technology with awards from the National Science Foundation and the federally run Small Business Innovation Research program. NASA has been an early adopter of the software, as has an intelligence agency, Valdes-Perez said.
Although Vivisimo keeps a version of the software on its own Web site for searching the Internet, its primary goal is to sell the software to large enterprises, such as federal agencies. Organizations can use the software for intranet searches or for augmenting searches on their public Web sites. The software can work with most commercially available search engines, Valdes-Perez said, including those that index content in a wide variety of formats.
FirstGov is aware of Vivisimo's demonstration, Valdes-Perez said, but the program has not officially endorsed the product.
Joab Jackson is the senior technology editor for Government Computer News.