Hypertext Markup Language is a markup language written in SGML for use on the Web. Many
agencies and companies publish their electronic information in HTML on the Web. Web-based
electronic commerce, quickly becoming important in the private sector, is of increasing
interest to agencies.
The current World Wide Web Consortium standard is HTML 4.0, but many browsers can only
read older versions of HTML, and other browsers support proprietary tags that have not
been accepted into the standard. The next generation of HTML proposed for consideration by
the consortium will implement HTML as a set of XML tags, allowing for a graceful
transition to the more powerful XML language.
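As a rough illustration of what "HTML as a set of XML tags" implies in practice, the sketch below feeds a small, well-formed page to a generic XML parser. The sample markup and the helper names paragraph_texts and is_well_formed are assumptions made for this example and are not drawn from the consortium's proposal.

```python
import xml.etree.ElementTree as ET

# A small page written as well-formed XML: every element is closed and
# properly nested. The markup is illustrative only.
PAGE = """<html>
  <head><title>Agency Notices</title></head>
  <body>
    <h1>Notices</h1>
    <p>First notice.<br/>Posted today.</p>
    <p>Second notice.</p>
  </body>
</html>"""


def paragraph_texts(markup):
    """Parse the page with a generic XML parser and return the text of each <p>."""
    root = ET.fromstring(markup)
    return ["".join(p.itertext()) for p in root.iter("p")]


def is_well_formed(markup):
    """Return True if the markup parses as XML, False otherwise."""
    try:
        ET.fromstring(markup)
        return True
    except ET.ParseError:
        return False


print(paragraph_texts(PAGE))
# An unclosed <br>, common in older HTML, is rejected by an XML parser.
print(is_well_formed("<p>Unclosed break<br></p>"))
```

The second call shows the cost of the transition: markup that older browsers tolerate, such as an unclosed line break, does not parse as XML.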
HTML authoring tools range from simple text editors to sophisticated visual Web design
In addition to authoring pages, Web site management tools such as Microsoft FrontPage
can automatically generate navigation links between pages in a site, standardize styles
across pages and maintain links when page names change.
VBScript. Specialized tools, which are not discussed in this guide, help Web programmers
develop server-side Web applications and dynamically incorporate database records in Web
Because HTML is an application of SGML, an SGML authoring tool can be used to edit
HTML, given an HTML DTD. Because most HTML-specific authoring tools support scripting,
uploading and link maintenance as well as tag and content creation, they are better suited
than SGML authoring tools to creating Web sites.
This Buyers Guide lists SGML and XML authoring and formatting tools available in the
United States, plus a small sample of HTML authoring and conversion tools.
Agencies use HTML primarily for Web sites, both internal intranet sites and public
sites on the Internet.
XML has not yet really arrived in government, although it is likely to be in use within
the next year as, for instance, a way to automatically connect Web sites to SGML documents
SGML's highly structured nature makes it suitable for creating searchable CD-ROMs.
Government, especially DOD, uses SGML for document storage and publishing. SGML
publishing works well within the CALS framework, where large documents with relatively
simple standard formats are the rule. Several limits crop up in document formatting,
Some SGML formatting engines have trouble generating acceptable multicolumn layouts,
wrapping text around illustrations and correctly formatting CALS tables and equations.
None of these is a problem in conventional desktop publishing environments, but such
environments don't address the long-term stability issues that prompted creation of
Even editing tables and equations can be problematic. Compared with table and equation
edit functions in desktop publishing programs, those in SGML authoring programs often seem
primitive. Given the large number of tags and attributes generated by the editors and the
number of SGML table formatting standards, however, one can easily understand the problems
developers have faced and overcome.
SGML documents are often put to multiple purposes: The same document may be destined
for a print publication, a CD-ROM and a Web site.
This can present a problem. Many more SGML tags are required for a good searchable
CD-ROM than are needed for a printed document, in which excess tags can add significantly
to the document's development cost and cause maintenance headaches later.
Some systems can automatically convert SGML documents to Web pages and other viewable
documents. The quality may vary, however, and it may be necessary to hand-tune HTML pages
each time they are generated to attain the highest quality Web pages, which is not a viable
option when thousands of pages are involved.
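A minimal sketch of such an automatic conversion follows. It assumes the source document is available in XML-compatible form, because Python's standard parser does not read full SGML. The element names and the TAG_MAP table are hypothetical and do not reflect how any particular product works.

```python
import xml.etree.ElementTree as ET

# Hypothetical mapping from source element names to HTML element names.
TAG_MAP = {"doc": "body", "section": "div", "title": "h2", "para": "p", "emph": "em"}


def to_html(source):
    """Return a copy of the source tree with each tag renamed through TAG_MAP.

    Unmapped elements become <span>; attributes are dropped in this sketch.
    """
    target = ET.Element(TAG_MAP.get(source.tag, "span"))
    target.text = source.text
    target.tail = source.tail
    for child in source:
        target.append(to_html(child))
    return target


SOURCE = (
    "<doc><section><title>Scope</title>"
    "<para>Applies to <emph>all</emph> units.</para></section></doc>"
)

html = ET.tostring(to_html(ET.fromstring(SOURCE)), encoding="unicode")
print(html)
```

A one-to-one tag mapping like this is where quality problems tend to start: anything the table does not anticipate falls back to a generic element, and that output is what later needs hand-tuning.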
Some systems offer a way to view SGML documents directly from a Web browser, using a
One common problem is that external contributors and editors don't have SGML
authoring systems and cannot deal with the SGML tags in text documents.
As a result, tags can be lost in the revision process and must be re-entered when
revisions are merged into the master document. Some agencies deal with the problem by
having external contributors do revisions on paper, and the internal publications
department type the changes directly into the SGML system.
Some SGML authoring systems cannot deal with partial or invalid documents. So even if
an agency's field office or contractor can do SGML editing, interchange problems may
Contributors to a document may be restricted to seeing only part of a document, either
for security reasons or to avoid multiple revisions being made to the same document
section. To allow such multiple levels of access, it is sometimes necessary to extract
subdocuments from the master document and create a full set of context structure tags to
make valid documents for the working DTD. Some systems do this more easily than others.
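The extraction step can be sketched as follows, again assuming XML-compatible markup. The sample document, the id attributes and the function names are hypothetical. The sketch rebuilds only the chain of ancestor tags; a real DTD may also require sibling elements, such as a chapter title, before the subdocument is valid.

```python
import copy
import xml.etree.ElementTree as ET

MASTER = """<manual>
<chapter id="c1"><title>Safety</title>
<section id="s1"><para>Wear goggles.</para></section>
<section id="s2"><para>Check seals.</para></section>
</chapter>
<chapter id="c2"><title>Repair</title>
<section id="s3"><para>Replace filter.</para></section>
</chapter>
</manual>"""


def find_path(node, target_id, path=()):
    """Return the chain of elements from node down to the one with target_id."""
    path = path + (node,)
    if node.get("id") == target_id:
        return path
    for child in node:
        found = find_path(child, target_id, path)
        if found:
            return found
    return None


def extract_with_context(root, target_id):
    """Copy one element and wrap it in empty shells of its ancestors."""
    path = find_path(root, target_id)
    if path is None:
        raise KeyError(target_id)
    target = copy.deepcopy(path[-1])
    target.tail = None
    new_root = parent = None
    for ancestor in path[:-1]:
        # Keep the ancestor's tag and attributes, but none of its other content.
        shell = ET.Element(ancestor.tag, dict(ancestor.attrib))
        if parent is None:
            new_root = shell
        else:
            parent.append(shell)
        parent = shell
    if parent is None:
        return target
    parent.append(target)
    return new_root


sub = extract_with_context(ET.fromstring(MASTER), "s2")
print(ET.tostring(sub, encoding="unicode"))
```

The contributor sees only section s2, but it still sits inside the manual and chapter tags, so the working copy keeps its context when it is merged back.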
Some systems also track revisions better than others do. If documents you'll be
creating and publishing typically have long lifecycles with frequent revisions, be sure to
check a package's document comparison and revision marking
When publishing Standard Generalized Markup Language documents, you must
choose between formatting from pure SGML markup and formatting with additional filtering
ArborText's Adept Publisher, available for Unix and Microsoft Windows NT 4.0,
takes the former route and generates PostScript directly from a document's SGML,
Document Type Definition and Formatting Output Specification Instance using a multipass,
rule-based engine. The trade-off is you get more automatic publishing but less control
over the end product.
Adept Publisher is an appropriate printing engine for large, simply formatted
documents, including most of the Defense Department's Continuous Acquisition and
Lifecycle Support documents. It handles complex index and cross-reference structures and
does a good job with revision marking.
Adept Publisher includes all the functions of Adept Editor, a highly configurable
authoring system for pure native SGML and Extensible Markup Language.
In its default view, Adept displays two editable panes, a document map and an edit
view. The edit view has most of the features of desktop word processors, and it deals with
SGML tags in several ways.
You can specify tags to be viewed or hidden in the document. A quick-tag entry menu
helps you to create only those tags valid in the current context. Even when tags are
hidden, empty tags are displayed and highlighted to let you fill in the missing tag
contents. When searching a document, you can find text inside specific tags if
When dragging an element within a document, cursor cues indicate whether you can move
the element, where to drop the element, where the context would change the tags and where
the element would be invalid.
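The context checks behind a quick-tag menu or drag-and-drop cue can be sketched with a simplified content model. This is an illustration of the general idea, not Adept's implementation: the ALLOWED_CHILDREN table and both function names are hypothetical, and a real DTD content model also constrains the order and number of children.

```python
# Hypothetical, simplified content model: parent tag -> child tags it may contain.
ALLOWED_CHILDREN = {
    "chapter": {"title", "section"},
    "section": {"title", "para", "list"},
    "list": {"item"},
    "item": {"para"},
}


def valid_tags(parent):
    """Tags a quick-tag menu could offer inside the given parent element."""
    return sorted(ALLOWED_CHILDREN.get(parent, set()))


def can_drop(element, new_parent):
    """True if the dragged element is allowed as a child of new_parent."""
    return element in ALLOWED_CHILDREN.get(new_parent, set())


print(valid_tags("section"))
print(can_drop("para", "item"))
print(can_drop("section", "list"))
```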
An external equation editor pops up when you add or edit an equation; it includes a
palette of equation symbols.
Adept takes its commands from menus, toolbars, dialogs and a command line. You can
customize menus, toolbars and dialogs. You can automate publishing via Adept Command
Language and compiled .dll files that tie into Adept's object model.
An add-on product for developers simplifies extensive customization and
automation.
Adept does a good job of handling partial documents and invalid tags, and of
importing and exporting XML, and it integrates with six document management systems.
Adept Publisher sells for $2,350; Adept Editor sells for $1,350. More details on both
products are available at www.arbortext.com.
Contact ArborText Inc. of Waltham, Mass., at 781-529-1000 or 734-997-0200.
Martin Heller is a software developer, consultant and writer in Andover, Mass.