An XML registry is key to sharing data

Owen Ambur, co-chair of the XML Community of Practice, says agencies are in the early stages of putting XML to use.

Rick Steele

When the Federal CIO Council created its Emerging Technology Subcommittee in August 2003, Extensible Markup Language was high on its list of priorities. XML promises to help agencies share data more easily and devise and manage their enterprise architectures. But the work involved in making XML easy for agencies to use remains considerable.

Owen Ambur, chief XML strategist for the Interior Department, is co-chairman of the XML Community of Practice, along with Lee Ellis of the General Services Administration's Office of Governmentwide Policy. GCN writer Joab Jackson interviewed Ambur about the promise (and perils) XML holds for agencies.

GCN: In your estimation, how far along are most agencies in terms of using XML?

Ambur: Most agencies are still in the early stages of beginning to take advantage of the potential of XML.

GCN: What do you see as the chief benefit that XML will bring agencies?

Ambur: XML is just a syntax, a way of structuring data. However, simple things succeed whereas complex initiatives generally fail. The genius of the Web was the simplicity of the standards that enabled it.

Now that the use of HTML is pervasive, the opportunity presents itself to make the Web smart. XML contributes toward that end by making it easy to share data, the elements of which have been clearly defined and are well understood. Thus, the benefit of XML is to facilitate the sharing of data agencies need to conduct business efficiently and effectively.

GCN: What's the most pressing problem for agencies in implementing XML?

Ambur: XML's greatest strength is also its greatest weakness: extensibility. Anyone can create an XML vocabulary, and that's good because different agencies have different data requirements. However, it can also be a problem to the degree that agencies are using different terms to express the same concepts.
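
To illustrate the vocabulary problem Ambur describes, here is a minimal Python sketch. It is not drawn from any agency's actual schema: the element names, the sample records and the crosswalk table are all invented for illustration. It shows two records that express the same concepts in different terms, and a simple mapping that reconciles them.

```python
import xml.etree.ElementTree as ET

# Hypothetical records: two agencies describing the same person
# with different element names (all names invented for illustration).
AGENCY_A = "<Person><Surname>Doe</Surname><BirthDate>1970-01-31</BirthDate></Person>"
AGENCY_B = "<Individual><LastName>Doe</LastName><DOB>1970-01-31</DOB></Individual>"

# Hypothetical crosswalk from each agency's local term to a shared term.
CROSSWALK = {
    "Surname": "FamilyName",
    "LastName": "FamilyName",
    "BirthDate": "DateOfBirth",
    "DOB": "DateOfBirth",
}


def to_shared_terms(xml_text):
    """Return {shared term: value} for each child element with a known mapping."""
    root = ET.fromstring(xml_text)
    return {
        CROSSWALK[child.tag]: child.text
        for child in root
        if child.tag in CROSSWALK
    }


# Once both records use the shared terms, they can be compared directly.
print(to_shared_terms(AGENCY_A) == to_shared_terms(AGENCY_B))
```

Without an agreed vocabulary, every pair of agencies that wants to exchange data would need to build and maintain a table like this one by hand.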

GCN: What XML projects have you seen that could serve as models?

Ambur: I'm not sure that any agency would claim yet to have a best practice for proper implementation of XML. But some are ahead of others in using good practices.

For example, the Navy has been a thought leader in fostering XML naming and design rules as well as issuing enlightened policy. And the Justice Department has done a good job of bringing together the global justice community to agree upon a common vocabulary for important law enforcement concepts and data.

We also have high hopes for the core-data-types focus group at the Homeland Security Department with respect to the specification of XML data elements that are commonly used across agencies and applications.

GCN: What long-term projects does the XML Community of Practice track or participate in?

Ambur: When the XML Working Group was initially chartered in 2000, we were explicitly instructed not to take on operational tasks. Subsequently, the CIO Council reorganized with the intent of becoming more operationally oriented, and when the XML Working Group was rechartered [in September 2002], we were asked to identify XML-related projects benefiting multiple agencies.

Foremost among those is the XML registry.

GCN: How important is developing a federal registry?

Ambur: It is far more difficult than it should be to discover and reuse XML data elements, vocabularies and schema that have already been specified by others. Given the difficulty, agency leaders who have moved out forthrightly to specify the XML elements required to conduct their own agency's business are to be commended, not criticized.

We need to make it easy for such enlightened leaders to reuse core data elements and schema that represent concepts that cut across agencies, applications, systems and lines of business.

In support of the registry, the National Institute of Standards and Technology put up a pilot and the General Services Administration contracted with Booz Allen Hamilton Inc. of McLean, Va., to draft the business case.

The business case established a return on investment in the range of 530 to 555 percent for the preferred alternative, which is a federated set of interoperating registries, and an even higher ROI, in the range of 1,300 to 1,400 percent, for a single, centralized registry.

Based upon the business case, $2.1 million was included in the president's budget for the registry. Unfortunately, Congress failed to appropriate the necessary funding.

Subsequently, when the working group was rechartered as the XML Community of Practice [in September last year], we were directed to focus on providing support for the CIO Council's emerging technology lifecycle management process. Soon we will be bringing up the ET.gov site to provide the means to identify and begin to build communities of practice around emerging technology components.

Hopefully, the site and the process will help to accelerate the wise and effective adoption and use of emerging technology components by government agencies.

GCN: I've heard the argument that program managers shouldn't worry about XML because next year it will be replaced by some other new technology. How is XML different from another case of 'this year's fad'?

Ambur: No one would argue that XML is the be-all and end-all of data architecture. But XML was expressly designed to have some attributes to ensure survivability.

First, it is merely plain text, so it will always be readable by human beings. Second, its structure makes it readily readable by machines as well as humans.

Those attributes make it the clear choice for records warranting long-term retention, and the fact that XML is an open standard ratified by the World Wide Web Consortium makes it the first choice for sharing data across agencies, applications and systems in the near-term as well.

Finally, by virtue of its openness, human and machine readability, and structure, it will be relatively easy to convert XML documents into any other format deemed more desirable in the future.

Indeed, the W3C's Extensible Stylesheet Language Transformations standard already provides for transforming XML documents into other formats, such as the Adobe Portable Document Format.
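
The point about machine readability and easy conversion can be shown with a short Python sketch. It does not use XSLT; it relies only on Python's standard library, and the sample record and its element names are invented for illustration. It parses a small XML document and re-expresses it as JSON.

```python
import json
import xml.etree.ElementTree as ET

# Hypothetical record; the element names are invented for illustration.
RECORD = """<Record>
  <Title>Annual Report</Title>
  <Agency>Interior</Agency>
  <Year>2004</Year>
</Record>"""


def xml_to_json(xml_text):
    """Convert a flat XML record (one level of child elements) to a JSON string."""
    root = ET.fromstring(xml_text)
    # Each child element becomes a key/value pair; surrounding whitespace is dropped.
    fields = {child.tag: (child.text or "").strip() for child in root}
    return json.dumps({root.tag: fields}, sort_keys=True)


print(xml_to_json(RECORD))
```

The sketch handles only a flat record. Nested elements, attributes and repeated elements would need additional rules, but the explicit structure of XML keeps such rules straightforward to write.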

GCN: At the XML Community of Practice meetings, there was a lot of talk about getting the terminology of the schema elements correct. Why is this important?

Ambur: It is important not only to get the semantics of each element correct, in order to avoid confusion, but also to determine which sets of elements appropriately 'hang together' in XML schema fragments to facilitate and maximize reusability in contexts other than the one in which the elements and schema were originally developed.

GCN: When did you first come across XML?

Ambur: I don't recall when or how I first heard about XML, but I do remember what I suggested the government should do with it. I suggested that all government forms should be rendered in XML and the data from them should be gathered in XML, and I also suggested that XML metatags should be used to classify and manage electronic records governmentwide.

At the time I made those suggestions, I had no reason to dream that XML might become the native format of the records themselves.

I give Microsoft Corp. credit for helping to convince the market in that regard, by supporting XML natively in its Office Suite. I also give Adobe [Systems Inc.] a lot of credit for the vigor with which they are now incorporating XML into their product suite. In particular, they are to be lauded for their leadership in AIIM's initiative to specify an international standard for Integrated Enterprise Content Management, which will leverage the potential of XML.
