XML tools can tune up e-gov efforts

A host of XML development tools can add life to your Web pages

Vendors still have to agree on standards, but the Extensible Markup Language can give currency to e-commerce plans

BY JOHN MCCORMICK | SPECIAL TO GCN

As the Web evolves toward greater use of electronic commerce, Extensible Markup Language has emerged as its lingua franca in waiting. XML provides a level of functionality beyond that of Hypertext Markup Language and a common way to identify data in transactional documents.
But a greater ease of use doesn't necessarily mean converting to XML is easy. Nor does it mean that HTML will be left behind.

The simplest way to understand XML is to understand how it differs from HTML. Both derive from Standard Generalized Markup Language (HTML as an SGML application, XML as a simplified subset), but where HTML concentrates on style, XML focuses on substance.



HTML was developed to format documents moving across networks, particularly the Internet. It lays out paragraphs, headings, lists and simple illustrations for the Web.

Flying blind

But HTML is limited in its ability to describe the kind of data displayed. An order form designed in HTML isn't really aware of the data it contains, so the data is difficult to transfer to other applications.

XML gets into the content. It is a metalanguage (a language describing how another language is structured) used to define the data structure of documents and make it easier to exchange data between applications.

XML lets programmers create documents that describe the various data fields to allow the information to be used by other programs, similar to the way database entries do.

In XML, a page developer can define new tags; in HTML, tags are predefined. And in XML, software can easily mine the data over a network, as long as all the programs recognize the same tags.
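
To make the idea concrete, here is a minimal Python sketch of a program pulling fields out of a small XML order document. The tag names (order, customer, item) and the sku and quantity attributes are invented for illustration; any real exchange would depend on both sides agreeing on them.

```python
import xml.etree.ElementTree as ET

# Hypothetical order document; tag and attribute names are assumptions.
ORDER_XML = """<order>
  <customer>Jane Doe</customer>
  <item sku="A-100" quantity="3">Toner cartridge</item>
  <item sku="B-220" quantity="1">Paper, case</item>
</order>"""


def read_order(xml_text):
    """Return the customer name and a list of (sku, quantity, description)."""
    root = ET.fromstring(xml_text)
    customer = root.findtext("customer")
    items = [
        (item.get("sku"), int(item.get("quantity")), item.text)
        for item in root.findall("item")
    ]
    return customer, items


customer, items = read_order(ORDER_XML)
print(customer)
for sku, quantity, description in items:
    print(sku, quantity, description)
```

Because each value is labeled by its tag, the receiving program does not have to guess which part of the page is the customer name and which is a quantity, which is exactly what an HTML order form cannot tell it.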

Because of their SGML heritage, XML documents can be parsed and validated the same way any SGML file can. XML is better suited for general use because it discards some of the most complex features of SGML.


The Lowdown

• What is it? XML is a metalanguage that uses tags to describe the content of data in a document in much the same way that HTML formats the appearance of a document. Both derive from SGML.

• Why would you need it? XML can serve as the glue between various applications, making it easier to exchange data among products and to publish information in an accessible manner. It also can make creating Web pages easier for the developer.

• When wouldn't you need it? If you don't publish or gather data over the Internet and aren't currently developing an integrated information management solution with components from various vendors, then you can probably ignore XML.

• Price? The tools for implementing XML vary greatly and so do their prices. They range from free, downloadable XML editors (text editors with XML syntax built in) and other development tools to full content management suites that can cost tens of thousands of dollars.

• Status? Some of the most critical parts of XML, including the schema that will make it easy to exchange data among various applications from different vendors, are not yet standardized. So XML cannot instantly integrate incompatible data. But many vendors have promised to follow the World Wide Web Consortium's recommendations when they are finalized, which could happen as soon as this spring.

• Must-know info? As usual with a new technology, the hype exceeds the reality. In this case, XML can do all that its proponents claim; the exaggeration comes in the claims of how easy it will be to implement.



HTML can do many of the things that proponents tout as XML features but only through a bewildering array of sometimes incompatible plug-ins. As the Web has grown, HTML has been unable to grow with it except through third-party add-on utilities.

But HTML won't be riding off into the sunset any time soon. It does some things better than XML (and does some things XML cannot do), so there is no move to replace it.

A new World Wide Web Consortium (W3C) standard, XHTML Version 1.0, is a combination of HTML and XML that allows Web sites to add XML to their pages but lets some old browsers continue to view the content.

XML promises:

• A simple way to exchange data over the World Wide Web

• A new beginning for programmers who have pushed HTML to the limit

• A new paradigm for querying databases over enterprise or local networks

• An easy way to exchange data between various local applications

• A solid middleware standard that makes XML distributed computing possible because applications can access distributed objects across the Web.

But what XML can do today is wildly different.
A lack of standards is one problem. But there is also a fundamental problem with the entire XML concept that must be reconciled before it becomes a transparent data exchange tool.

The biggest hurdle to adoption of XML is its most important feature: the ability to create new tags. This ability lets users customize XML documents to exactly fit their needs. But unless you are developing an in-house system, custom tags won't be recognized by other applications, virtually eliminating XML's usefulness.

Vendors producing various XML products create Document Type Definitions that describe their product's metadata schema. A validating editor or processor uses the DTDs to determine if a particular document contains valid XML code.

Because this is such a flexible process, there is no standard set of definitions and therefore no way to exchange data among different applications unless they share identical DTDs.

There are published sets of DTDs but, because there are many different sets, none of them can be described as a standard.

No magic bullets


The ABCs of XML

• Document Type Definition (DTD): the language created by XML producers that describes the contents of SGML or XML documents.

• XML schema: a superset of DTDs written in XML syntax and created with XML tools.

• Document Object Model (DOM): an application programming interface for accessing HTML and XML documents from a Web browser.

• Extensible Stylesheet Language (XSL): the XML equivalent of the Cascading Style Sheet used with HTML, which XML also supports. XSL is an XML-specific style sheet format.

• Parser: a tool that finds XML tags, extracts the information they contain and passes it along to applications.

• Editor: a text editor that works within XML.


If your applications can share data before you incorporate XML, then all probably will be well. But if your data is incompatible now, merely formatting it with XML tags won't magically make it compatible.

The best option today is to use mapping tools that can read DTDs from one vendor and translate them into another vendor's DTDs. But is this what you thought XML promised?

XML is touted as an easy, seamless way to share data among disparate applications. But when you have incompatible data sets requiring XML coding on both sides, with a layer of translation from mapping tools in between, 'easy' could be a question of degree.

In most cases you will have to analyze the two DTD sets and create the mapping. Commercial translation tools don't understand the various DTDs; they just manage the routine translation after you do the work.
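
The mapping work can be pictured with a short Python sketch. It assumes two hypothetical vendor vocabularies that differ only in tag names; the TAG_MAP table and the translate function are illustrative and not part of any commercial tool. Real mappings usually also involve restructuring and data conversion that someone must work out by hand.

```python
import xml.etree.ElementTree as ET

# Hypothetical mapping from vendor A's tag names to vendor B's.
# Someone who understands both vocabularies has to write this table.
TAG_MAP = {
    "purchase": "order",
    "buyer": "customer",
    "part": "item",
}


def translate(element, tag_map):
    """Return a copy of element with tags renamed according to tag_map."""
    new = ET.Element(tag_map.get(element.tag, element.tag), dict(element.attrib))
    new.text = element.text
    new.tail = element.tail
    for child in element:
        new.append(translate(child, tag_map))
    return new


VENDOR_A_DOC = "<purchase><buyer>Jane Doe</buyer><part>Toner</part></purchase>"
translated = translate(ET.fromstring(VENDOR_A_DOC), TAG_MAP)
result = ET.tostring(translated, encoding="unicode")
print(result)
```

The code only automates the renaming. Deciding that vendor A's buyer means the same thing as vendor B's customer is the analysis that still falls to the user.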

The W3C XML Schema is intended to define the structure, content and semantics of XML documents, superseding SGML DTDs.

The XML Schema reached candidate recommendation status in October, meaning that it now can be implemented on the Web and users can provide feedback on the specification.

The XML Schema is necessary to teach XML systems how to parse data to recognize, for example, the difference between a string of text and a Social Security number.
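
As a rough illustration of the kind of check a schema-aware system could make, the Python sketch below tests whether the text inside a hypothetical ssn element matches the familiar 3-2-4 digit layout. It uses a plain regular expression rather than the W3C XML Schema itself, and the element names are assumptions.

```python
import re
import xml.etree.ElementTree as ET

# Layout check only: three digits, two digits, four digits, hyphen-separated.
SSN_PATTERN = re.compile(r"\d{3}-\d{2}-\d{4}")


def check_ssn_fields(xml_text, tag="ssn"):
    """Return (text, looks_like_ssn) for every element named tag."""
    root = ET.fromstring(xml_text)
    return [
        (el.text, bool(SSN_PATTERN.fullmatch(el.text or "")))
        for el in root.iter(tag)
    ]


# Hypothetical records; the second ssn value is ordinary text.
RECORDS = (
    "<people>"
    "<person><name>Jane Doe</name><ssn>123-45-6789</ssn></person>"
    "<person><name>John Roe</name><ssn>not on file</ssn></person>"
    "</people>"
)

results = check_ssn_fields(RECORDS)
for text, ok in results:
    print(text, ok)
```

Without some shared description of what an ssn field must look like, both values above are just strings of text to the receiving program.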


For more on XML, go to...

• www.biztalk.org: BizTalk, a group organized by Microsoft Corp. that promotes XML.

• www.ebxml.org: ebXML is a joint effort of the United Nations Center for Trade Facilitation and Electronic Business and OASIS (see below) for creating a set of specifications for electronic business.

• www.isoc.org: The Internet Society.

• www.oasis-open.org: OASIS, the Organization for the Advancement of Structured Information Standards.

• www.voicexml.org: VoiceXML Forum, a group sponsored by AT&T Corp., Lucent Technologies Inc. and Motorola Inc.

• www.w3.org/XML: The World Wide Web Consortium's XML page, site of the most official XML information.


Until the XML Schema reaches full recommendation status, which is expected by spring, vendors and industry groups will continue to create their own DTDs.

DTDs have become well-entrenched in the user community simply because it has taken more than two years to develop the XML Schema.

Microsoft Corp. and other companies have said they will follow any W3C standard once it reaches the recommendation level.

The software available for XML ranges from full-blown software development kits to free editors and other tools. IBM's Alphaworks, for instance, publishes a lot of cutting-edge XML products for free download.

Editors are simple and valuable tools: text editors with XML syntax built in.
Software development kits offer the complete set of tools for generating valid XML code, integrating XML into your present operations and even helping to migrate existing data or applications to XML.

Although XML is still a few steps away from fulfilling its promise, several agencies are already making use of it.

The Patent and Trademark Office, for instance, is using Infrastructures for Information Inc.'s S4/Text to develop an XML system for authoring patent applications that can be filed over the Internet.

The Securities and Exchange Commission, the Department of Defense, the IRS and the Government Printing Office also are creating XML documents for such uses as online forms, document exchange and online training.

John McCormick, a free-lance writer and computer consultant, has been working with computers since the early 1960s.

| Vendor | Product | Platform | Description | Price |
|---|---|---|---|---|
| Adobe Systems Inc., San Jose, Calif., 408-536-6000, www.adobe.com | FrameMaker + SGML | Windows, Macintosh, Unix | XML publishing | $1,449 |
| Artesia Technologies Inc., Rockville, Md., 301-548-4000, www.artesiatech.com | TEAMS | Unix | Enterprise digital asset manager | $100,000 up |
| Bluestone Software Inc., Philadelphia, 610-915-5000 x3052, www.bluestone.com | Visual XML | NT, Solaris | Integrated development environment | Free download |
| The Breeze Factor LLC, Encinitas, Calif., 888-547-5620, www.breezefactor.com | Breeze XML Studio | Java | JavaBeans from XML creator | $495 up |
| Chrystal Software Inc., San Diego, 858-676-7700, www.chrystal.com | Astoria | Solaris, Windows | Enterprise authoring tool | $5,000 up |
| CueSoft.com Inc., Brighton, Colo., 303-637-9807, www.cuesoft.com | Exml Editor | Windows | XML editor, tree or source view | Free |
| | CUEXml ActiveX 2.2 | Windows | XML parser and DOM components | $79 |
| DataChannel Inc., Bellevue, Wash., 425-462-1999, www.datachannel.com | DataChannel Server | NT, Solaris | XML enterprise portal | $100,000 up |
| Eidon, Lakeville, Minn., 612-461-4238, www.eidon-products.com | eidonXbase | Windows, Solaris | Database content management tool | $20 |
| IBM Corp., Armonk, N.Y., 914-765-1900, alphaworks.ibm.com or www.software.ibm.com for DB2 XML Extender | XML for C++ | Windows, Solaris, Unix, Linux | C++ libraries with classes for parsing, generating and validating XML documents | Free |
| | XML and Web Services DE | NT, Win 2000 | XML Web development tools | Free |
| | XML Diff and Merge Tool | AIX, Linux, NT | Reconciles, compares and merges XML documents | Free |
| | XML Enabler | Java | Makes data viewable by any browser | Free |
| | XML Generator | Java | GUI DTD editor | Free |
| | XML Security Suite | Windows, Linux | Supports digital signatures, encryption and access control | Free |
| | DB2 XML Extender | Windows, Unix, Linux, Solaris, AIX | Adds XML document support to DB2 | Free |
| Infoteria Corp., Beverly, Mass., 978-922-4029, www.infoteria.com | iPex | C++ libraries for Windows, Unix | C++ XML processing engine | $150 up |
| | iMessenger | NT | Retrieves XML data e-mail | $1,200 up |
| Infrastructures for Information Inc., Toronto, 416-504-0141, www.i4i.com | S4/Text | Microsoft Word | Forms XML editor working with Word | $7,500 (includes training) |
| Interwoven Inc., Sunnyvale, Calif., 408-774-2000, www.interwoven.com | Ajuba2 | NT, Solaris, Red Hat Linux | XML author, server and manager | $30,000 up |
| Intranet Solutions Inc., Eden Prairie, Minn., 800-989-8774, www.intranetsolutions.com | eXpedio | NT, Win 2000, Unix, Linux, Solaris | Content management and publishing | $40,000 up |
| Microsoft Corp., Redmond, Wash., 206-936-7329, www.microsoft.com | Microsoft XML Parser in Java | Windows | XML parser | Free |
| | Biztalk Jumpstart Kit | NT | Uses Biztalk schemas and Microsoft software to build Web pages | Free |
| Open Text Corp., Waterloo, Ontario, 800-499-6544, www.opentext.com | Aelfred | Java | DTD-aware XML parser | Free |
| | Sax | Windows | Simple API for XML driver | Free |
| | Near and Far Designer XML | Windows | XML modeling and authoring tool | $99 up |
| SoftQuad Software Inc., Toronto, 416-544-9000, www.softquad.com | XmetaL 2.0 | Windows | XML editor | $495 |
| Software AG USA, San Ramon, Calif., 925-242-3700, www.softwareagusa.com | XML Starter Kit | NT, Unix | Database and development environment | Free 90-day trial |
| | Tamino | NT, Unix, Linux, OS/390 | Native XML database server | $25,000 per CPU; $40,000 for Solaris |
