    Web sites get results-oriented

    Agencies are working with Google to boost rankings and increase traffic

    PRIMO FINDS: NIH's Dennis Rodrigues says the goal is to boost the quality of search results rather than the quantity.

    Rick Steele

    When people search for federal information online, the vast majority reach first for search engines like Google or Yahoo.

    Only 4 percent of visitors to www.nih.gov, for instance, got there by typing the URL into their browser's address line, according to a ComScore research study released last year. The rest arrived by typing nih.gov into a search engine (usually Google's) and then clicking on the results.

    This has set up an interesting dynamic between search engine companies and the federal government. The feds want their sites to appear high on the list of results delivered. Google, Yahoo and the other search engines want to have satisfied searchers. The more content that is searchable, the better, and the happier everybody is.

    "Users performing a search think, 'I've been diagnosed with cancer, and I need information.' They don't think about their information sources," said J.L. Needham, who represents Google's public sector content partnership. "But if people can't find something, they blame it on Google, not the government."

    To boost their rankings on search lists, agencies have been working with Google to develop sitemaps, which are lists of Web addresses, written in Extensible Markup Language (XML), that point to database records.

    A sitemap can take a couple of forms, Needham said. At its simplest, it can be a list of URLs submitted through Google's Webmaster Tools Web site at www.google.com/webmasters/tools.
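
    As a rough illustration of the forms mentioned above, the sketch below produces both a plain list of URLs and a minimal XML sitemap using Python's standard library. The example addresses and function names are hypothetical placeholders, not taken from any agency's site; the namespace is the one published with the sitemap protocol at sitemaps.org.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemap protocol (sitemaps.org).
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def text_sitemap(urls):
    """Simplest form: a plain-text list with one URL per line."""
    return "\n".join(urls) + "\n"


def xml_sitemap(urls):
    """XML form: a <urlset> holding one <url><loc> entry per address."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for url in urls:
        entry = ET.SubElement(urlset, "url")
        # ElementTree escapes characters such as '&' when serializing.
        ET.SubElement(entry, "loc").text = url
    return ET.tostring(urlset, encoding="unicode")


# Hypothetical addresses that point at individual database records.
example_urls = [
    "https://www.example.gov/records?id=1",
    "https://www.example.gov/records?id=2&view=full",
]

print(text_sitemap(example_urls))
print(xml_sitemap(example_urls))
```

    Either output could then be saved to a file and submitted to a search engine; which form a given site should use is a judgment call that this sketch does not settle.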

    Much of the government's information on the Web is uncrawlable, Needham said.

    "Some estimates are that as much as 90 percent of government information is not accessible through Web search engines" because it is embedded in databases, Needham said. "We estimate that at about 50 percent." A sitemap makes this information visible to the search engines.

    Opting in

    But does this request for sitemaps put Google in the tricky position of telling the federal government what to do?

    No, said Chris Sherman, executive editor of Searchengineland.com. "It's voluntary. Web sites don't have to do it," he said. "I don't think any of the search engines are dictating anything. Their concern is to get as much content as they can. As good as search engines have become, there are still some barriers."

    Most government Web sites do quite well in search engine rankings, he said. A sitemap will boost a site's ranking if much of its content is stored in databases. "Databases are tough for search engines to crack," he said.

    Seeking content

    Historically, search engines have looked with suspicion on content providers, Sherman said. "Now they're saying, 'We want your content, and this is how to get it.'"

    The sitemap protocol is an industry standard, supported by Google, Yahoo and Microsoft. The actual development of a sitemap doesn't take much more than a day or so.

    And federal Webmasters don't seem to mind complying with the protocol. If anything, it's a labor of love.

    Setting up the sitemap for www.plainlanguage.gov took Miriam Vincent between eight and 10 hours. Vincent is an attorney at the Social Security Administration, but she volunteers time to the Plain Language Action and Information Network, an interagency working group of federal employees who promote the use of plain language for all government communications. Vincent describes herself as the site's Webwright (the -wright suffix indicating a careful craftsman, as in wheelwright or shipwright), not its master.

    Before Vincent instituted the sitemap, one of the site's specific examples of plain language "wouldn't show up on the first page or first two pages" of Google results, she said. The site's examples of language, both plain and obfuscating, are some of its most popular features, and they eluded search engines.

    Since Vincent implemented the sitemap, she has seen some increase in Web traffic.

    Now when users type "plain language" into Google, plainlanguage.gov is the first result. Type in "plain language" and "engineer jargon," and the site is still the first result.

    Vincent has to do a short copy-paste step when she updates the database, but some other federal Web sites have managed to automate the process entirely, dynamically generating an XML file, she said.
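
    The article does not describe how those sites automate the step. One plausible approach, sketched below in Python under stated assumptions, is to rebuild the XML file from the database each time it changes. The table name, column names and URL pattern are all hypothetical, and an in-memory SQLite database stands in for a real content store.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Namespace defined by the sitemap protocol (sitemaps.org).
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
# Hypothetical URL pattern for a page that displays one database record.
RECORD_URL = "https://www.example.gov/examples/{id}"


def build_sitemap(conn):
    """Return sitemap XML with one <url> entry per row in the database."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    rows = conn.execute("SELECT id, updated FROM examples ORDER BY id")
    for record_id, updated in rows:
        entry = ET.SubElement(urlset, "url")
        ET.SubElement(entry, "loc").text = RECORD_URL.format(id=record_id)
        # <lastmod> is an optional protocol tag; dates use YYYY-MM-DD.
        ET.SubElement(entry, "lastmod").text = updated
    return ET.tostring(urlset, encoding="unicode")


# Stand-in database; a real site would connect to its own content store.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE examples (id INTEGER PRIMARY KEY, updated TEXT)")
conn.executemany(
    "INSERT INTO examples (id, updated) VALUES (?, ?)",
    [(1, "2008-01-15"), (2, "2008-02-03")],
)

sitemap_xml = build_sitemap(conn)
print(sitemap_xml)
```

    Running a script like this on a schedule, or whenever a record is added, would remove the manual copy-paste step, because every new row appears in the next generated file.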

    It took the Energy Department's Office of Scientific and Technical Information 12 hours to create its sitemap using the Google protocol, said Walt Warnick, OSTI's director. "We've spent more time talking about what we did regarding the sitemap protocol than we did executing it," Warnick said.

    When osti.gov began offering sitemaps several years ago, the agency saw a huge increase in traffic. "The first day that Yahoo offered up our material for search, our traffic increased so much that we could not keep up with it," Warnick said.

    Everybody wins

    Dennis Rodrigues, chief of the online information branch for the National Institutes of Health, called the sitemap project a win-win for federal Web sites and search engines. Rodrigues coordinates sites for 27 separate agencies under the health agency's umbrella.

    "I think a lot of the bread-and-butter stuff agencies have on the Web sites [was] already carefully indexed," Rodrigues said. The bulk of searches sent to NIH Web sites are for health problems, such as cancer, diabetes and heart disease. But it would be harder for someone looking for information on a particular gene or protein, he said. The information would be buried in a database.

    Rodrigues said developing sitemaps is more about creating "a better quality of the site's index and covering all the disparate, eclectic information." The goal of the project is to boost the quality of search results, rather than the quantity.

    "As federal providers, we have a lot of concern about whether or not the public is going to be able to find our information, especially about health information," Rodrigues said. "We know with the ever-growing volume of information on the Web, it's easy to become lost in a sea of data."
