Power User: Here's a guide to finding what's invisible on the Web

John McCormick

Many Web users know how to get the most out of the Google.com search engine or its up-and-coming rival, AllTheWeb.com. Also useful are metasearch engines such as Dogpile.com, Ixquick.com, Net-comber.com, Pandia.com and Search.com.

Ixquick.com queries 10 search engines and organizes the top 10 unique results--a good way to get a fast overview, but with little if any actual data.
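To make the idea of "top 10 unique results" concrete, here is a minimal sketch of how a metasearch site might merge several engines' result lists. It is an illustration only, not Ixquick's actual method: the function name merge_unique, the ranking rule (addresses returned by more engines come first) and the example.com addresses are all assumptions of mine.

```python
from collections import Counter

def merge_unique(result_lists, limit=10):
    """Merge ranked address lists from several engines into one list of unique addresses.

    Hypothetical ranking rule: addresses returned by more engines come first;
    ties keep the order in which the addresses were first seen.
    """
    votes = Counter()
    for results in result_lists:
        # Count each address once per engine, even if an engine repeats it.
        votes.update(list(dict.fromkeys(results)))
    # sorted() is stable, so equal vote counts keep first-seen order.
    return sorted(votes, key=lambda url: -votes[url])[:limit]

# Made-up result lists standing in for two engines' answers.
engine_a = ["a.example.com/1", "b.example.com/2", "c.example.com/3"]
engine_b = ["b.example.com/2", "d.example.com/4"]
print(merge_unique([engine_a, engine_b]))
```

The address both engines agree on rises to the top, which is why a merged list makes a quick overview.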

QbSearch.com queries up to 17 search engines and presents up to 20 pages of results from each.

With so many valuable resources, it becomes increasingly difficult to avoid crashing your system by keeping too many browser windows open.

Navigatesmarter.com has a feature called Quickbrowse, which eliminates the need to open a dozen browser sessions or keep clicking back and forth. Quickbrowse stores the links as you click on them. When you're done, click on the Quickbrowse bug to see all the links you selected on a single screen.

Beyond the billions of pages indexed by these search engines, there is a far larger, so-called invisible Web of databases. They generate a dynamic page for each search, so the information can't be indexed by search engines.

You can locate some of these databases via a search engine if you type the word "database" in the query, but that merely finds the front end, not the actual information.

Other invisible Web pages have no fixed uniform resource locator or are formatted in a way that makes them difficult to index. Many use scripts to generate results, and the addresses include question marks, which search engines are generally programmed to ignore.
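If you want to check whether an address is the script-generated kind, a few lines of Python will do it. This is a rough sketch: the function name looks_dynamic and the example.com addresses are made up for illustration, and real search engine spiders may well apply other rules.

```python
from urllib.parse import urlsplit

def looks_dynamic(url: str) -> bool:
    """Return True if the address carries a query string (the part after '?')."""
    # urlsplit only recognizes the host when a scheme is present, so add one if missing.
    if "://" not in url:
        url = "http://" + url
    return urlsplit(url).query != ""

# Hypothetical addresses: one fixed page, one generated by a script.
for address in [
    "example.com/reports/index.html",
    "example.com/search.cgi?topic=budget&year=2002",
]:
    print(address, "->", "dynamic" if looks_dynamic(address) else "static")
```

The second address is the sort a spider would typically skip, because everything after the question mark is input to a script rather than the name of a stored page.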

Also, any site that requires a password can't be indexed by search engine spiders. You can often obtain a password to such sites just by asking, but first you have to know they exist.

InvisibleWeb.com has links to thousands of searchable databases. Enter a general search category, and the site will recommend one or more databases. InvisibleWeb also has a brief but useful description of each database, such as whether it is free or requires registration.

Lii.org, a librarian's index, has links to research on about 10,000 topics. You will probably find one very good link per topic. Only a tiny fraction of this information is indexed by search engines because most topics include a script with a question mark.

Direct Search is one of the best collections of obscure research links and databases I have found. It recently disappeared from its old location and reappeared at www.freepint.com/gary/direct.htm. Some of the most important links went dead in the newest version, however.

Many older links are also saved at Archive.org, but you should probably edit the URL and go directly to the live site for the most current information.

For example, www.archive.org shows the Navy's directives at web.archive.org/web/20011212035256/neds.nebt.daps.mil. But you probably want to go to neds.nebt.daps.mil, which will have the latest page, not the archived page. This is a bit roundabout but so useful that it's worth the trouble.
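The URL edit amounts to stripping off the archive prefix. Here is a small sketch that assumes archived addresses follow the pattern seen in the example above, web.archive.org/web/ plus a 14-digit timestamp; the function name live_address is my own.

```python
import re

# Assumed pattern: web.archive.org/web/<14-digit timestamp>/<original address>
ARCHIVE_PREFIX = re.compile(r"^(?:https?://)?web\.archive\.org/web/\d{14}/")

def live_address(archived_url: str) -> str:
    """Strip the archive prefix, leaving the address of the live site."""
    return ARCHIVE_PREFIX.sub("", archived_url)

# prints neds.nebt.daps.mil
print(live_address("web.archive.org/web/20011212035256/neds.nebt.daps.mil"))
```

An address that doesn't match the pattern comes back unchanged, so it is safe to run over a mixed list of links.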

Don't ignore CompletePlanet.com just because it looks like a basic site. It's a vast resource for the invisible Web, although it's sometimes difficult to find what you need.

You should always evaluate the quality of information you dig up from the Internet. For a refresher course in research principles, check out the tutorials at library.albany.edu/internet and lib.berkeley.edu/TeachingLib/Guides/Internet/FindInfo.html. There's another good tutorial on Web searching and sifting at thelearningsite.com/cyberlibrarian/searching/ismain.html.

These and other links are compiled at my new site, Helpdotcom.com.

Please send suggestions about other hidden sites or databases you know about, and I'll post them.

John McCormick is a free-lance writer and computer consultant. E-mail him at powerusr@yahoo.com.
