Web sites are far from dynamic unless editor, tool hit on same page

If your government World Wide Web site serves dynamic pages instead of static Hypertext
Markup Language documents, chances are you're only moderately happy with the results.

That's one of the growing pains as the Web matures from a place where we browse flat
text and graphics into a dynamic interface for multiple networked resources.

I've been experimenting with headline news pages that pull the page titles and first
paragraphs of stories from three different news databases. In theory, portions of these
pages could change whenever new stories are added to the archives.

Problem is, this full automation produces too many false hits. Most pages still need a
human filter (a Web editor) to produce truly meaningful content.

The trick is to find a way for the editor and the automated tools to work in concert,
so the editor can spend just a couple of minutes instead of hours updating pages.

I've been working with the Search97 Information Server from Verity Inc. of Sunnyvale,
Calif. Its topic sets can encompass words, phrases or concepts about a particular subject.

For instance, if I wanted to find military-related articles in the collections, I might
create a topic called Defense Department. My topic set might have such phrases as Air
Force, Defense Information Systems Agency and Pentagon, each word carrying a certain
weight and the word combinations having a higher weight. A search would return a ranked
set of pointers to stories with one or more of the desired words.
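To make the weighting idea concrete, here is a rough sketch in Python. Verity builds topic sets with its own tools, so none of this reflects the actual product; the term list, the weights and the function names are all invented for illustration.

```python
# Illustrative only: mimics a weighted topic set, where phrases carry
# higher weights than single words, and returns ranked pointers.

def score_document(text, topic):
    """Sum the weights of every topic term that appears in the text."""
    text = text.lower()
    return sum(weight for term, weight in topic.items() if term in text)

# Hypothetical "Defense Department" topic set.
DEFENSE_TOPIC = {
    "pentagon": 1.0,
    "air force": 2.0,
    "defense information systems agency": 3.0,
}

def rank_hits(documents, topic, threshold=0.0):
    """Return (score, doc_id) pairs sorted best-first, like ranked pointers."""
    hits = [(score_document(body, topic), doc_id)
            for doc_id, body in documents.items()]
    return sorted((h for h in hits if h[0] > threshold), reverse=True)
```

Note that a scorer this naive will happily rank a file of rock lyrics that mentions the Pentagon, which is exactly the false-hit problem described below.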

Sounds easy, but it's risky to dump these pointers automatically into a latest news
page. Yes, a given file might contain the words Pentagon and Air Force, but what if
they're in rock lyrics instead of true DOD news?

Your agency will be embarrassed if some of your site's pointers turn out to be grossly
false hits.

The way to improve quality is to take a step back from full automation. I've been
experimenting with an editor's workstation that uses a Web interface to manage the way
visitors see data. Using a scripting language that comes with Search97, our programmers
have developed a series of Common Gateway Interface scripts that let Web editors preview
search results and filter out irrelevant hits before publishing final pages.

Instead of using Verity's search engine to generate a page based on the search results,
the editor generates a list of potential files. Viewed in a Web browser, the list fills
an upper frame with pointers to those files.

Click on a pointer and the file appears in the lower frame.

If the editor approves an item, three short form fields beside its pointer let the
editor assign a rank, rewrite the title if it isn't clear and paste in a descriptive
sentence or two.
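A bare-bones version of that review screen might look like the sketch below. Our actual scripts use the scripting language bundled with Search97; this Python stand-in just shows the two-frame layout and the per-hit form fields, with all names and field sizes invented.

```python
# Hypothetical sketch of the editor's review screen: an upper frame
# listing candidate hits, a lower frame previewing the clicked file.

def frameset_page(list_url, preview_url):
    """Two frames: the hit list on top, the file preview below."""
    return (f'<frameset rows="40%,60%">'
            f'<frame name="hits" src="{list_url}">'
            f'<frame name="preview" src="{preview_url}">'
            f'</frameset>')

def review_row(url, title):
    """One candidate hit: a pointer plus rank/title/description fields."""
    return (f'<a href="{url}" target="preview">{title}</a> '
            f'<input name="rank" size="2"> '
            f'<input name="title" value="{title}"> '
            f'<input name="desc" size="40"><br>')
```

Because each pointer targets the "preview" frame, clicking it loads the file in the lower frame while the form fields stay visible above.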

The Web editor then generates a page, reviews it, and chooses whether to publish or
regenerate it with new selections. The whole thing takes five minutes per page.

Because the page comes from a standardized template, it takes only about an hour to add
a new page on a new topic, complete with new title graphics. Building the editor's
workstation required about 45 hours of programming.
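The template step is simple enough to sketch as well. This is not our production code, just a hedged illustration of the idea: the editor's approved items (rank, title, pointer, description) drop into a standard template, so a new topic page needs little more than a topic name and a new title graphic. All identifiers here are invented.

```python
# Illustrative template fill: approved items from the editor's
# workstation become a finished headline page.

PAGE_TEMPLATE = """<html><head><title>{topic} Headlines</title></head>
<body><img src="{graphic}" alt="{topic}">
{items}
</body></html>"""

def build_page(topic, graphic, approved):
    """approved: (rank, title, url, description) tuples chosen by the editor."""
    rows = [f'<p><a href="{url}">{title}</a> - {desc}</p>'
            for rank, title, url, desc in sorted(approved)]
    return PAGE_TEMPLATE.format(topic=topic, graphic=graphic,
                                items="\n".join(rows))
```

Sorting on the editor-assigned rank, rather than the raw search score, is what keeps the published order meaningful.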

As more and more data comes with HTML tags, a tool like Verity or Fulcrum Technologies
Inc.'s SearchBuilder or Tympani Development Inc.'s NetAttache can help you automate
database queries on multiple local or remote resources merely by filtering returned
pointers for a single keyword or a full topic set.

Viewing the results in a Web browser greatly simplifies things for the end user. And
editing the results for clear presentation helps tame the chaos of information overload.

The Verity server also has an Intelliserv option that supports server-push technology.
Visit http://www.verity.com for more information.

Shawn P. McCarthy is a computer journalist, webmaster and Internet programmer for
GCN's parent, Cahners Publishing Co. E-mail him at smccarthy@cahners.com.
