The Web page as database

The Extensible Hypertext Markup Language can best be thought of as a prim and proper HTML, for better or worse.

XHTML has a lot of 'thou shall not' rules, said Bob DuCharme, a solutions architect at Innodata Isogen, who gave a presentation on the standard at the XML 2007 conference held in December. It forces designers to adhere to the basic HTML formatting rules, such as always typing in lower case or always nesting elements in their proper order.
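To see what this discipline means in practice, here is a minimal sketch that runs a few fragments through a strict XML parser, which is roughly how XHTML markup gets treated. The helper name `is_well_formed` and the sample fragments are illustrative assumptions, not something from DuCharme's talk. The check covers only matching and nesting of tags; it does not enforce the lower-case rule itself.

```python
import xml.etree.ElementTree as ET

def is_well_formed(markup):
    """Return True if a strict XML parser accepts the markup, False otherwise."""
    try:
        ET.fromstring(markup)
        return True
    except ET.ParseError:
        return False

# Illustrative fragments (assumptions, not from the article)
print(is_well_formed("<p><b>bold</b> text</p>"))  # elements nested in order
print(is_well_formed("<p><b>bold</p></b>"))       # elements closed in the wrong order
print(is_well_formed("<P>text</p>"))              # opening and closing tags differ in case
```

A forgiving HTML browser would likely render all three fragments; a strict parser rejects the last two outright.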

But this forced discipline will soon produce some serious benefits. Version 2 of XHTML, now in the last stages of development at the World Wide Web Consortium, sets the stage for 'using a Web page like a database,' DuCharme said.

XHTML allows organizations to create their own custom set of tags, which could be used by workflow software to automate new processes around plain old Web pages and other formerly unstructured documents. For instance, organizations could define their own tags that indicate specific sections of a Web page. With regular HTML, a developer could only guess at where one section would end and another one would begin, perhaps by using headline tags such as <h1> or <h2>. A dedicated section tag could eliminate this ambiguity, simplifying workflow scripts.
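As a rough illustration of how explicit section markup simplifies a workflow script, the sketch below pulls each section out of a small page by its identifier. The element names `section` and `h`, the `id` values and the function `sections_by_id` are all hypothetical, and namespace declarations are left out for brevity.

```python
import xml.etree.ElementTree as ET

# Hypothetical page: element names and ids are assumptions made for this
# example, and the namespace declaration is omitted to keep it short.
PAGE = """<html>
  <body>
    <section id="invoice"><h>Invoice</h><p>Total due: 120.00</p></section>
    <section id="notes"><h>Notes</h><p>Net 30 days.</p></section>
  </body>
</html>"""

def sections_by_id(xhtml_text):
    """Map each section's id attribute to the text inside that section."""
    root = ET.fromstring(xhtml_text)
    result = {}
    for section in root.iter("section"):
        # Collect the non-empty text pieces of the section and its children.
        pieces = [t.strip() for t in section.itertext() if t.strip()]
        result[section.get("id")] = " ".join(pieces)
    return result

print(sections_by_id(PAGE))
```

Because each section has an explicit start and end, the script never has to infer boundaries from headline tags.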

For further nuance, XHTML uses its own subset of the Resource Description Framework, called RDFa, which allows organizations to annotate their terms with attributes such as 'about,' 'property,' 'datatype,' 'instanceof,' and others, said Mark Birkbeck of Applied Xforms.
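The following sketch suggests how software might read such annotations as database-like records. It is a deliberately simplified reading of three of the attributes named above ('about', 'property' and 'datatype'), not a conforming RDFa processor. The `inv:` property names, the example.org address and the function `extract_statements` are invented for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical annotated markup; the "inv:" vocabulary and the URL are
# assumptions made up for this example.
FRAGMENT = """<div about="http://example.org/invoice/42">
  <span property="inv:customer">Acme Corp</span>
  <span property="inv:total" datatype="xsd:decimal">120.00</span>
</div>"""

def extract_statements(element, subject=None):
    """Yield (subject, property, value, datatype) tuples.

    Simplified: the subject is the nearest enclosing 'about' value and the
    value is the element's text. Real RDFa processing has many more rules.
    """
    subject = element.get("about", subject)
    prop = element.get("property")
    if prop is not None:
        value = "".join(element.itertext()).strip()
        yield (subject, prop, value, element.get("datatype"))
    for child in element:
        yield from extract_statements(child, subject)

for statement in extract_statements(ET.fromstring(FRAGMENT)):
    print(statement)
```

Each tuple reads like a row in a table: what the statement is about, which property it describes, the value, and an optional data type.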

RDFa is being used in other formats too, such as the next version of Open Document Format (Version 1.2), noted Svante Schubert of Sun Microsystems. Someone using an Open Office template embedded with your organization's RDFa tags, for instance, could e-mail an attached invoice to your organization, and the software on your end could automatically process that invoice.

About the Author

Joab Jackson is the senior technology editor for Government Computer News.

