Why you should know the difference between search tools and discovery tools

 


There is some overlap among search, information discovery and e-discovery tools. But the way those tools conduct searches and present information differentiates the three concepts, Internaut columnist Shawn McCarthy says.

Government information technology workers might have heard the following three phrases used interchangeably: search tools, information discovery tools and e-discovery tools.

Depending on your definition, the concepts overlap somewhat, but they also differ significantly. It’s important to understand those differences, some subtle and some not, especially as government agencies enter more information into sprawling storage and data archiving systems.

All three terms relate to seeking information across multiple data archives. But the three concepts are differentiated by the way searches are conducted and the presentation of results.

Search tools. This term often is used generically to refer to many types of internal or external search engines, directories and information archives. Most search tools are designed to interact with a computer program (often a crawler, spider, indexing bot or similar system) that was created to retrieve documents or data. The crawler and its associated search tools can be set up to interact with one specific database, a set of databases, a single computer network or even the full Internet. Searches with such tools often are based on a keyword, a set of keywords or a phrase that appears in one of the files the spider indexed.
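As a rough illustration of keyword matching against an index, here is a minimal sketch in Python. The three sample documents and the function names are hypothetical, and a production crawler and indexer would be far more elaborate.

```python
from collections import defaultdict

# Hypothetical mini-corpus standing in for files a crawler has retrieved.
DOCUMENTS = {
    "doc1": "Saturn is the sixth planet from the Sun",
    "doc2": "The Saturn sedan was sold in the United States",
    "doc3": "Jupiter and Saturn are gas giants",
}

def build_index(documents):
    """Map each lowercased word to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in documents.items():
        for word in text.lower().split():
            index[word].add(doc_id)
    return index

def search(index, query):
    """Return the ids of documents that contain every keyword in the query."""
    words = query.lower().split()
    if not words:
        return set()
    results = set(index.get(words[0], set()))
    for word in words[1:]:
        results &= index.get(word, set())
    return results

index = build_index(DOCUMENTS)
print(sorted(search(index, "saturn")))         # every document mentions Saturn
print(sorted(search(index, "saturn planet")))  # only the astronomy document
```

A single keyword matches all three documents, which is exactly the ambiguity problem that plain keyword search runs into.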

A simple keyword search can be useful unless the meaning of the term is ambiguous. For example, if you search for the word "Saturn," do you mean the planet, the car, the rocket or the old Sega Saturn game console?

To help resolve ambiguity, some search engines also collect information from a file's metadata fields. Metadata can be useful for setting the context of a keyword. If metadata indicates that a file contains information about the solar system and planets, a good search engine would assume that any matching keywords in that file refer to Saturn the planet, not Saturn the car brand.
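The metadata idea can be sketched in a few lines of Python. The file records and the "subject" metadata field below are assumptions made for illustration; real systems draw on whatever metadata fields their files actually carry.

```python
# Hypothetical indexed files. The "subject" metadata field is an assumption.
FILES = [
    {"name": "rings.txt", "subject": "astronomy", "text": "Saturn has bright rings"},
    {"name": "recall.txt", "subject": "automotive", "text": "Saturn issued a recall"},
    {"name": "console.txt", "subject": "video games", "text": "The Saturn used CD-ROM discs"},
]

def search_with_context(files, keyword, subject=None):
    """Keyword match first, then narrow by the metadata subject if one is given."""
    keyword = keyword.lower()
    hits = [f for f in files if keyword in f["text"].lower()]
    if subject is not None:
        hits = [f for f in hits if f["subject"] == subject]
    return [f["name"] for f in hits]

print(search_with_context(FILES, "saturn"))
print(search_with_context(FILES, "saturn", subject="astronomy"))
```

Without a subject, all three files match. With the subject set to astronomy, only the file about the planet remains.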

But what if someone searching for Saturn the planet doesn’t remember the name of the planet? Or what if they are looking for information about planets in general, and they simply enter the name Saturn as one example? What they really need is more guidance built into their search results.

Information discovery tools. Some types of information discovery tools are simply multiple search results presented in a logical way to help users make additional choices. Some of the results are just interfaces to secondary search tools, arranged to help guide an evolving search.

A basic example of a discovery tool is the “Did you mean” feature that Google presents if you misspell a search term. Besides executing a keyword search, Google's search system also looks through a database of common misspellings. If it finds a match, the search results page helps you discover a correct spelling. But it doesn’t automatically assume you meant the correct spelling, so it still offers keyword matches for the misspelled version of your word.
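A toy version of that behavior might look like the following Python sketch. It is not Google's implementation. The misspelling table, the keyword index and the function name are all hypothetical.

```python
# Hypothetical lookup table of common misspellings and their corrections.
COMMON_MISSPELLINGS = {"saturm": "saturn", "jupitor": "jupiter"}

# Hypothetical keyword index: term -> pages that contain it.
KEYWORD_INDEX = {
    "saturn": ["planets.html", "rings.html"],
    "saturm": ["forum-post-17.html"],  # a page that repeats the typo
}

def search_with_suggestion(term):
    """Return matches for the term exactly as typed, plus a suggestion if known."""
    term = term.lower()
    return {
        "results": KEYWORD_INDEX.get(term, []),
        "did_you_mean": COMMON_MISSPELLINGS.get(term),
    }

print(search_with_suggestion("Saturm"))
```

The misspelled query still returns its own keyword matches, and the suggestion is offered alongside them rather than silently substituted.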

Discovery tools can help refine your search or ask questions to help you make additional search decisions. Two excellent examples are the Recent Activity boxes on eBay and the “People who bought this book also bought” links on Amazon.com or Barnes and Noble's Web site. By tapping other databases and not just their own index of keywords and matches, those sites make fairly accurate predictions about other things that you might be looking for.

Information discovery should not be confused with semantics. In general, semantics means identifying the meaning of a word or phrase, and the Semantic Web efforts championed by the World Wide Web Consortium have made great strides in helping people understand this issue.

But the semantic approach is not a perfect solution when the people doing the searching don't know the specifics of what they are looking for, much less the exact word. Thus information discovery comes down to three things: available paths, context and pattern matching.

Available paths can be represented through additional line items that offer parallel choices from other databases. This is similar to Google’s “Did you mean” choice but significantly expanded to many different conduits of information. When a good information discovery interface is used to search for Saturn, you might receive a straight set of search results that is complemented by other options. Sometimes, they are presented as small search results boxes with two or three matching choices, plus a link that will take you down that particular result’s path. Other sets of search results might include links from a database on the solar system, a few documents on gas giants, a handful of pictures of planets with rings, and so on. The results help you discover other paths and encourage you to refine your choices. Following one of those paths in turn takes you to other search tools and resources.
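One way to picture those small results boxes is the Python sketch below. The source names and titles are hypothetical, and the three-item preview is just one reasonable choice.

```python
# Hypothetical sources. Each maps a source name to the titles it holds.
SOURCES = {
    "Solar system database": ["Saturn", "Saturn's moons", "Saturn's rings", "Saturn missions"],
    "Documents on gas giants": ["Saturn and Jupiter compared"],
    "Image library": ["Saturn rings photo", "Uranus rings photo"],
    "Vehicle records": ["Ford F-150"],
}

def discovery_boxes(sources, keyword, preview=3):
    """Build one small results box per source that has at least one match."""
    keyword = keyword.lower()
    boxes = []
    for name, titles in sources.items():
        matches = [t for t in titles if keyword in t.lower()]
        if matches:
            boxes.append({
                "source": name,
                "preview": matches[:preview],
                # Number of further matches behind the "more" link for this path.
                "more": max(0, len(matches) - preview),
            })
    return boxes

for box in discovery_boxes(SOURCES, "saturn"):
    print(box)
```

Each box shows a short preview and a count of the remaining matches, which stands in for the link that takes you down that source's path.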

Context comes into play when the search system already knows something about you or your search and uses that knowledge to limit the results. One great example of context is location. If you use your mobile device to search for, say, gas stations, the context can be limited to a 10-mile radius of your location. This function lets you discover nearby resources. In the same way, you can find restaurants, ATMs or known criminals.
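A location filter of this kind can be sketched with the haversine great-circle formula. The coordinates and station names below are made up for illustration, and a real service would typically rely on a spatial index rather than checking every record.

```python
import math

EARTH_RADIUS_MILES = 3958.8  # mean radius of the Earth

def distance_miles(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points, using the haversine formula."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_MILES * math.asin(math.sqrt(a))

def nearby(places, here, radius_miles=10):
    """Keep only the places within the given radius of the searcher's location."""
    lat, lon = here
    return [p["name"] for p in places
            if distance_miles(lat, lon, p["lat"], p["lon"]) <= radius_miles]

# Hypothetical gas stations and a hypothetical searcher location.
STATIONS = [
    {"name": "Station A", "lat": 38.91, "lon": -77.03},
    {"name": "Station B", "lat": 39.29, "lon": -76.61},
    {"name": "Station C", "lat": 38.98, "lon": -77.10},
]
HERE = (38.90, -77.04)

print(nearby(STATIONS, HERE))                  # default 10-mile radius
print(nearby(STATIONS, HERE, radius_miles=5))  # tighter radius
```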

By limiting our example to, say, police needs, the various pieces might combine this way: Your context is where you are. Your search is for people who own blue Ford trucks. Your discovery tools present the various paths that have been enabled for you in the search results. Possible examples include pull-down menus that filter by the age of the truck, whether the owner lives in an apartment building, the age of the owner, and so on. Truly flexible discovery tools let you follow one path and then adjust settings without starting your search over, such as expanding your search to 20 miles or limiting results to just Ford F-150 pickups.
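The idea of adjusting a search without starting over can be sketched by treating the query as a small object whose settings are changed one at a time. The record fields, the `Query` class and the sample data below are hypothetical.

```python
from dataclasses import dataclass, replace
from typing import Optional

@dataclass(frozen=True)
class Query:
    make: str
    color: str
    radius_miles: float
    model: Optional[str] = None  # None means any model

# Hypothetical vehicle records. distance_miles is assumed to be precomputed
# from the searcher's location.
RECORDS = [
    {"owner": "A", "make": "Ford", "model": "F-150", "color": "blue", "distance_miles": 4.0},
    {"owner": "B", "make": "Ford", "model": "Ranger", "color": "blue", "distance_miles": 8.5},
    {"owner": "C", "make": "Ford", "model": "F-150", "color": "blue", "distance_miles": 15.0},
    {"owner": "D", "make": "Ford", "model": "F-150", "color": "red", "distance_miles": 2.0},
]

def run(query, records):
    """Return the owners whose records satisfy every setting in the query."""
    return [
        r["owner"] for r in records
        if r["make"] == query.make
        and r["color"] == query.color
        and r["distance_miles"] <= query.radius_miles
        and (query.model is None or r["model"] == query.model)
    ]

base = Query(make="Ford", color="blue", radius_miles=10)
wider = replace(base, radius_miles=20)     # expand the radius, keep everything else
f150_only = replace(wider, model="F-150")  # then narrow to one model

print(run(base, RECORDS))
print(run(wider, RECORDS))
print(run(f150_only, RECORDS))
```

Each refinement copies the previous query and changes one setting, so the earlier choices carry forward instead of being re-entered.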

Pattern matching applies if the discovery tools also recommend links that other people think are useful.
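A simple form of this is counting which pages other people visited in the same session. The Python sketch below assumes hypothetical session logs and page names. Commercial recommendation systems are considerably more sophisticated.

```python
from collections import Counter

# Hypothetical session logs: the pages each earlier visitor opened.
SESSIONS = [
    ["saturn-overview", "gas-giants", "ring-systems"],
    ["saturn-overview", "ring-systems"],
    ["saturn-overview", "cassini-mission", "ring-systems"],
    ["jupiter-overview", "gas-giants"],
]

def also_viewed(sessions, page, top_n=2):
    """Rank other pages by how many sessions they share with the given page."""
    counts = Counter()
    for session in sessions:
        if page in session:
            counts.update(p for p in set(session) if p != page)
    # Sort by count (highest first), then by name so ties come out in a stable order.
    ranked = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return [name for name, _ in ranked[:top_n]]

print(also_viewed(SESSIONS, "saturn-overview"))
```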

E-discovery. This is a different concept from the two terms we just reviewed. It can involve search tools, but e-discovery usually refers to a discovery process related to court cases, in which someone searches for information stored electronically. Information that might be relevant as evidence in a lawsuit includes e-mail; instant messages; logs from online chat rooms; stored electronic documents of all types, including older versions of files; databases, including research, product information, and accounting or finance databases; Web sites; and even raw data files. Because litigators might need to review e-discovery materials in a number of ways, it's not unusual for discovered information to be saved in multiple formats.

E-discovery tools often exist as specific applications, and they are popular with people who manage large archives of government information. In late 2009, EMC acquired e-discovery vendor Kazeon. With this addition, EMC offers a set of e-discovery and litigation readiness applications.

Understanding the differences among searching, information discovery and e-discovery can help government employees understand and use the concepts. That goes a long way toward helping people find the right information at the right time to do their jobs.
