XML looks like a winner, but where are its tag sets?
Shawn P. McCarthy
Do you use XML yet? If not, would a governmentwide Extensible Markup Language tag set help?
Don't expect to see such a tag set soon, but one is surely needed if the government is ever to share XML documents and data efficiently.
Although XML is slowly revolutionizing electronic business, it has yet to make big inroads into government data presented on the Internet.
There are a few exceptions. The General Services Administration is running an XML pilot to provide prices and accept orders in its GSA Advantage online buying system. The agency shows vendors how to code their data, which can stay on their own Web sites, while GSA gathers the tagged data as needed.
The Defense Logistics Agency and others are experimenting with ways to use XML to pull data from legacy systems. But the government has few large XML conversion projects under way. Most agencies took a wait-and-see attitude when XML was first touted as the next great Net language. They continued to wait during the past year as the XML standard was unveiled.
The waiting is over now. XML has proved to be a flexible resource for online presentation of legacy data. It's time to accept it as the dominant Web data format.
You heard right
Yes, that's data format, not document format. Most XML documents will be generated on the fly from databases and read by browsers. The documents then disappear, and more data is culled from databases as needed for the next documents.
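To make the generate-on-the-fly idea concrete, here is a minimal sketch in Python. It assumes a small in-memory SQLite table; the table, column and tag names (products, catalog, product) are invented for illustration and do not come from any agency system.

```python
import sqlite3
import xml.etree.ElementTree as ET


def rows_to_xml(conn, query, root_tag, row_tag):
    """Run a query and wrap each result row in XML elements.

    Column names become child tag names, so they must be valid XML names.
    """
    cur = conn.execute(query)
    columns = [description[0] for description in cur.description]
    root = ET.Element(root_tag)
    for row in cur:
        item = ET.SubElement(root, row_tag)
        for name, value in zip(columns, row):
            ET.SubElement(item, name).text = str(value)
    return ET.tostring(root, encoding="unicode")


# Hypothetical sample data, standing in for a legacy database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE products (code TEXT, description TEXT, price REAL)")
conn.executemany(
    "INSERT INTO products VALUES (?, ?, ?)",
    [("A-100", "Stapler", 7.5), ("B-200", "Desk lamp", 19.0)],
)

# The XML document exists only as long as the caller needs it.
catalog_xml = rows_to_xml(
    conn,
    "SELECT code, description, price FROM products ORDER BY code",
    "catalog",
    "product",
)
print(catalog_xml)
```

Nothing is stored as a document here. The XML string is built when requested and can be discarded once the browser has it, which is the pattern described above.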
XML's long-term promise is personalized content views and one-to-one publishing. It will make it easy to design, create and manage databases and data views. XML can create customized content for anyone, drawing from properly tagged data sources on the Web.
That's great, but what about the tag sets? In Hypertext Markup Language, the tag set is fixed. <h1> always means a headline in HTML; a tag such as <product.code> means nothing.
XML allows specialized tag sets. Formal and ad hoc organizations have sprung up to develop tag sets for medicine, education and electronic commerce. But progress has been slow because of the inevitable political battles.
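A shared tag set is only useful if documents stick to it. In practice a document type definition would normally enforce that, but the idea can be sketched with a simple check. The vocabulary below is hypothetical, not a published tag set, and the function name unknown_tags is an invention for this example.

```python
import xml.etree.ElementTree as ET

# Hypothetical shared vocabulary, for illustration only.
SHARED_TAGS = {"catalog", "product", "product.code", "price"}


def unknown_tags(xml_text, allowed=SHARED_TAGS):
    """Return, in sorted order, the tags in a document that are not in the agreed set."""
    root = ET.fromstring(xml_text)
    used = {element.tag for element in root.iter()}
    return sorted(used - set(allowed))


conforming = (
    "<catalog><product><product.code>X</product.code>"
    "<price>1</price></product></catalog>"
)
nonconforming = "<catalog><item><sku>X</sku></item></catalog>"

print(unknown_tags(conforming))     # no tags outside the shared set
print(unknown_tags(nonconforming))  # lists the tags a partner would not recognize
```

Both documents are well-formed XML. Only the first would mean anything to a partner who knows the shared set.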
That's why the federal government needs a standardized tag set for documents and databases. Imagine the Commerce Department being able to cull economic data from the Labor Department's Web site, using Labor's specific tag sets across documents and databases.
Agency users will no longer have to contact other agencies' personnel or drill blindly through Web sites to find specific data types. Tags and specialized XML searches will do the work for them.
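As a rough illustration of letting tags do the searching, the sketch below scans several XML documents for one agreed tag. The tag name unemployment.rate, the source labels and the placeholder values are all made up for the example, and the documents are inline strings rather than pages fetched from real agency sites.

```python
import xml.etree.ElementTree as ET


def find_tagged(documents, tag):
    """Return (source, text) pairs for every element named `tag`."""
    hits = []
    for source, xml_text in documents.items():
        root = ET.fromstring(xml_text)
        for element in root.iter(tag):
            hits.append((source, (element.text or "").strip()))
    return hits


# Hypothetical documents; the values are placeholders, not real statistics.
documents = {
    "labor-example": (
        "<report><title>Monthly summary</title>"
        "<unemployment.rate>PLACEHOLDER-1</unemployment.rate></report>"
    ),
    "commerce-example": "<report><title>Trade summary</title></report>",
}

matches = find_tagged(documents, "unemployment.rate")
print(matches)
```

The search never looks at page layout or wording. It depends only on both parties agreeing on what the tag is called, which is the case for a common tag set.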
Who should coordinate development of such a tag set? Two collaborators ought to be the National Institute of Standards and Technology and the Government Printing Office.
NIST is a logical choice because it participated in early definitions of the XML standard and is working on an XML and Document Object Model conformance testing suite. It has a joint development effort page with the Organization for the Advancement of Structured Information Standards (OASIS), a Boston nonprofit consortium dedicated to product-independent data and content interchange. The page has an XML test suite, at www.oasis-open.org/committees/xmltest/testsuite.htm
GPO is a good choice, too, because it deals with such a wide range of printed and electronic documents.
Government Web sites that dynamically serve up content from various sources would do well to start tagging the data in their databases with existing XML tags. Visit www.xml.com to read about some groups that are evolving widely recognized tag sets and attributes.
Current Web browsers have only minimal ability to recognize XML. Visit support.microsoft.com/support/kb/articles/q223/3/37.asp for details about how Microsoft Internet Explorer parses XML. But browser capabilities are improving. All indications are that the Web is poised for an explosion of XML use next year.
The Gateway to Educational Materials project, at www.geminfo.org, will use XML to catalog and track curriculum materials stored online across multiple sites.
Like a bloodhound
XML eventually will help agencies find data that is trapped in documents and impossible to index. It can retrieve anything, from a physical location to a stock number to personnel information.
Background information about XML is posted at www.w3.org/xml, the site of the World Wide Web Consortium, which controls the basic XML specification.
XML expert Robin Cover and OASIS maintain extensive developer resources for XML at www.oasis-open.org/cover/xml.html. XML background documents appear at www.developer.com/directories/pages/dir.xml.html.
Shawn P. McCarthy designs products for a Web search engine provider. E-mail him at email@example.com.