The trouble with Web pioneering is that time marches on. Pioneers have to keep pulling
more tricks out of their hat to keep up with newcomer sites that flaunt shiny new
That's how it stands with the Securities and Exchange Commission's once
cutting-edge Web site.
Despite good graphics, the site, at http://www.sec.gov,
looks a bit old-fashioned, and its search tools are weak. Nevertheless, in the current
financial climate, it remains an extremely busy site and a symbol of the government's
commitment to posting public information online.
What the SEC site needs is a face-lift, and a big one is in the works, site managers
said. A cornerstone of the initiative is work on the massive Electronic Data Gathering,
Analysis and Retrieval system, which stores about 70G of corporate financial data. EDGAR
visitors account for 80 percent of the site's traffic, said Chuck Woods, chief of the
Server Development Branch.
A contractor maintains EDGAR. The site is run internally by SEC staff.
What's in EDGAR? Most corporations are required by law to disclose financial and
other information to SEC, and EDGAR collects, validates, indexes and stores that data.
Visitors to the SEC site can fill in forms to search EDGAR, or they can browse the rest
of the site to find agency information, contact points, investor advice files, details on
filing complaints, small-business information, press releases, proceedings from the
Enforcement Division and details about SEC rulings.
Counting EDGAR activity, the site moves up to 30G of information each day, a
massive amount by any standard.
Separate from the Web site, EDGAR has its own workflow management system. Each staff
attorney responsible for a certain industry finds a group of filings in the in-box every
day. On the back end, after the filing is accepted, it's forked off for SEC
processing and also sent to a dissemination system for subscribers.
But the EDGAR interface has not changed much from the original set of tools SEC
acquired from New York University for querying EDGAR via the Web. Another negative is that
files available through EDGAR appear as straight ASCII text, not Hypertext Markup
Language. They are difficult to read.
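As a rough illustration of what serving those ASCII files as HTML involves at minimum, here is a short Python sketch. It is not the SEC's or its contractor's conversion process; the function name `ascii_to_html` and the sample text are made up for the example. It simply escapes characters that HTML treats as markup and wraps the text so the column layout survives.

```python
import html

def ascii_to_html(text, title):
    """Wrap plain text in a minimal HTML page, escaping markup characters."""
    body = html.escape(text)
    return (
        "<html><head><title>" + html.escape(title) + "</title></head>\n"
        "<body><pre>" + body + "</pre></body></html>"
    )

# Hypothetical fragment of a plain-text filing.
sample = "NET SALES <1998>   1,234\nCOST & EXPENSES     987"
print(ascii_to_html(sample, "Sample filing"))
```

A wrapper like this only makes the text safe to display in a browser. Real readability gains would come from filings that are authored as HTML in the first place.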
The search interface, a Wide Area Information Server, also is out of date compared with
readily available and more powerful search tools.
In many ways, the SEC site has been a victim of its own success.
"We're bigger than we ought to be for the tools we have," Woods said.
The WAIS server is showing its age. It's dying under the load. Volume is such
that nightly indexing takes forever. Plus, it isn't full-text indexing.
Only the headers of EDGAR filings are indexed. On the main site, press releases and SEC
documents are indexed by keyword.
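The difference between header-only and full-text indexing can be shown with a toy inverted index in Python. This is a sketch under stated assumptions, not how WAIS or EDGAR is implemented: the two sample filings, the field names `header` and `body`, and the helper functions are all hypothetical.

```python
import re
from collections import defaultdict

# Hypothetical sample filings: a short header plus body text.
FILINGS = {
    "doc1": {"header": "ACME CORP 10-K 1998",
             "body": "Revenue rose on strong widget sales."},
    "doc2": {"header": "GLOBEX INC 10-Q 1998",
             "body": "Acme Corp remains our largest customer."},
}

def tokenize(text):
    """Lowercase the text and split it into alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())

def build_index(filings, fields):
    """Map each word to the set of filing IDs whose chosen fields contain it."""
    index = defaultdict(set)
    for doc_id, filing in filings.items():
        for field in fields:
            for word in tokenize(filing[field]):
                index[word].add(doc_id)
    return index

header_index = build_index(FILINGS, ["header"])
full_text_index = build_index(FILINGS, ["header", "body"])

print(sorted(header_index["acme"]))     # header-only index: doc1
print(sorted(full_text_index["acme"]))  # full-text index: doc1 and doc2
```

With the header-only index, a search for "acme" misses the second filing, which mentions the company only in its body. The trade-off is that a full-text index covers far more words, which is part of why nightly indexing gets slower as volume grows.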
The obvious starting point for a face-lift is to replace WAIS with a more powerful
search engine and convert files to HTML. But even more renovation is necessary. It's
not easy to find specific SEC divisions or contact points, for example.
BDM Federal Inc. of McLean, Va., now part of TRW Inc., holds the contract for
maintaining EDGAR and recently won a follow-up contract, Woods said. As the contractor
rolls out a new system, corporations will begin submitting filings as HTML documents. That
should make them more readable without placing pressure on the system to handle
Woods plans to install a search engine from Verity Inc. of Sunnyvale, Calif.
Verity's tool set ought to make it possible to customize an interface that
streamlines searches across multiple indexes.
Why did SEC wait so long to update that it must now tackle everything at once: new
contract, new EDGAR system, new functions and new search interface?
Part of the delay arose from getting the machinery in order.
Moving to a distributed environment will make it easier to meet server demands while
rolling out the new pieces. Also, work proceeds slowly on a site that handles legal
documents, to avoid data loss and downtime.
Webmasters at other agencies understand the magnitude of the job SEC now faces. They
may decide it is advisable to make smaller, iterative changes that keep their Web
+ Graphics are clean, and navigation from the button bar is intuitive.
+ The special-purpose search section shows
the SEC staff's ability to build products to serve the public well.
+ The About the SEC section
tells what the commission does and how it came about; it lists acts of Congress back to
the 1930s that define SEC's mission.
+ The rule-making section helps visitors
quickly locate proposed rules, final rules, interpretations and notices.
Good information is located in strange places. For
example, pointers to SEC decisions on year 2000 readiness are buried two levels down on a
public statements page. This important issue should get top billing.
The search interface isn't intuitive enough.
Why should a visitor have to enter a full corporate name? Will a partial name yield the
same search results? There are different search forms for different needs, and it's
not always obvious which one to use.
For some searches, dates are mixed up and records
are returned in no particular order.
The only search options are last week, last two
weeks or last month. Beyond that, visitors must search the entire database.
The small business section isn't updated as
frequently as it should be. The latest news section was updated last year.
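Several of the search complaints above (full names required, unordered results, fixed date windows) describe behavior that is simple to express in code. The Python sketch below is only an illustration of the desired behavior, not the SEC's search logic: the records, field names and the `search` function are hypothetical.

```python
from datetime import date

# Hypothetical filing records; names, form types and dates are made up.
RECORDS = [
    {"company": "Acme Widget Corp", "form": "10-K", "filed": date(1998, 3, 31)},
    {"company": "Acme Holdings Inc", "form": "8-K", "filed": date(1998, 9, 15)},
    {"company": "Globex Inc", "form": "10-Q", "filed": date(1998, 8, 14)},
]

def search(records, name_part, start=None, end=None):
    """Return records whose company name contains name_part (case-insensitive),
    optionally limited to an inclusive date range, sorted newest first."""
    needle = name_part.lower()
    hits = [
        r for r in records
        if needle in r["company"].lower()
        and (start is None or r["filed"] >= start)
        and (end is None or r["filed"] <= end)
    ]
    return sorted(hits, key=lambda r: r["filed"], reverse=True)

for r in search(RECORDS, "acme", start=date(1998, 1, 1), end=date(1998, 12, 31)):
    print(r["filed"].isoformat(), r["company"], r["form"])
```

Here a partial name such as "acme" matches both Acme companies, the caller chooses any date range, and results come back in a predictable order.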
Hits: 500,000 to 600,000 per day
Volume: Up to 30G moved per day; thousands of other sites
link directly to SEC searches
High traffic areas: Main page, search page, enforcement
complaint center where visitors find forms for questions and complaints, occasional hot
areas such as year 2000 information and press releases
Server hardware: Array of small Unix servers with
Pentium Pro processors as the front end to enterprise-class data servers
Search engine: WAIS
Connection: Switched T3
Internet provider: Digex Inc. of
Overall management: Chuck Woods
Content webmaster: Ruth Pitt
Internal Web development: Fran
Server maintenance, security and integration of new search
tools: User Technology Associates Inc. of Arlington, Va.
Original site development: Joe
Segreti and Mark Brickman
Shawn P. McCarthy is a computer journalist, webmaster and Internet programmer for
Cahners Business Information Inc.