Would a governmentwide XML schema registry cut duplication?

'Even if Core.gov is suitable as a repository, it seems to me you still need a registry to let people find the components they need.'

Owen Ambur

The Federal CIO Council wants to keep agencies from reinventing the wheel as they adopt Extensible Markup Language.

Owen Ambur, who co-chairs the CIO Council's XML working group, argues that a governmentwide XML schema registry, open to all agencies, would be a big help in describing data elements that already have XML names, instead of writing them anew.

'It is far harder than it should be for folks to discover and reuse data elements, not to mention actual instances of data,' said Ambur, a systems analyst for the Fish and Wildlife Service. 'They are often left with no practical choice but to reinvent the data elements they need.'

Others, however, believe a federal schema registry would be redundant. An existing interagency collaboration space, Core.gov, already can do that job, said Marion Royal, a components expert in the General Services Administration's Office of Governmentwide Policy.

Core.gov, overseen by the Office of Management and Budget, was designed as a repository and collaboration space for sharing components. And the CIO Council's Emerging Technology Subcommittee takes a broad view of what constitutes a component: anything from a small Java script to an entire e-government initiative. XML schemas can be considered components.

Core.gov 'looks at the broad scope of not only component-based development but also service-oriented architectures,' Royal said.

But Ambur said the federal clearinghouse for components of all sorts cannot work as a registry for schemas.

'Once we have a bunch of XML elements and schemas [in Core.gov], will people be able to readily discover them?' he asked.

XML provides an open format for sharing information among different systems. For systems to share data, they must have a common set of data definitions. And as agencies use XML in ever more complicated ways, they inevitably start writing schemas (dictionary definitions or labels) to keep all the different types straight.

No one is enforcing a common terminology to eliminate the slight variations that stymie cross-agency data exchanges. And no one is watching out for duplication of work, Ambur said.
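To see how a slight naming variation can stymie an exchange, consider a minimal sketch in Python. The two records and their element names (`BloodPressure`, `BP_Reading`) are invented for illustration and are not taken from any agency's actual schema. A reader written against one agency's names finds nothing in the other agency's record until someone supplies the mapping by hand.

```python
import xml.etree.ElementTree as ET

# Hypothetical records from two agencies. The element names are
# assumptions made up for this example, not real agency schemas.
AGENCY_A = "<Patient><BloodPressure>120/80</BloodPressure></Patient>"
AGENCY_B = "<Patient><BP_Reading>118/76</BP_Reading></Patient>"


def read_blood_pressure(xml_text, tag="BloodPressure"):
    """Return the text of the first child element named `tag`, or None."""
    element = ET.fromstring(xml_text).find(tag)
    return element.text if element is not None else None


# The reader expects agency A's element name.
reading_a = read_blood_pressure(AGENCY_A)

# Agency B holds the same kind of data under another name, so the lookup fails.
reading_b = read_blood_pressure(AGENCY_B)

# The data is recoverable only once the differing name is known.
reading_b_mapped = read_blood_pressure(AGENCY_B, tag="BP_Reading")
```

A shared registry would, in principle, let the second agency discover the existing element name before inventing its own.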

The President's IT Advisory Committee recently released a report, 'Revolutionizing Health Care through Information Technology,' which said medical systems at the Defense and Veterans Affairs departments and the Indian Health Service all use different terminology for data elements such as blood pressure measurements over time.

'Lack of agreement on these standards prevents sharing of interoperable data,' the report concluded. Not all the systems studied were XML-based, but a number were.

Existing registries

XML registries already exist at the Defense and Justice departments, the Environmental Protection Agency's Environmental Information Exchange Network and the IRS. Individual offices can consult their agency's registry to make sure their terminology agrees.

None of the existing registries can serve a governmentwide audience, however. In 2002, GSA's Office of Governmentwide Policy contracted with Booz Allen Hamilton Inc. of McLean, Va., to develop a business case for building a registry for the entire government.

Booz Allen Hamilton estimated that building an XML schema registry would cost about $7.7 million, with a total operational cost of around $59 million over a 10-year period.

A registry could be executed in one of two ways, the consultants said. One way would be a central repository, but another would be a federated model, with each agency housing its own schemas for access from a central portal.
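The difference between the two models can be sketched roughly in Python. This is an illustration under stated assumptions, not the consultants' design: the class names, the sample schema name and the example URLs are all hypothetical. In the central model one store holds every schema entry; in the federated model the portal holds nothing itself and asks each agency's own registry at lookup time.

```python
class CentralRegistry:
    """Central model: one store holds every agency's schema entries."""

    def __init__(self):
        self._entries = []  # list of (schema_name, agency, location)

    def register(self, schema_name, agency, location):
        self._entries.append((schema_name, agency, location))

    def find(self, schema_name):
        return [(agency, location)
                for name, agency, location in self._entries
                if name == schema_name]


class FederatedPortal:
    """Federated model: each agency keeps its own schemas; the portal only queries them."""

    def __init__(self, agency_registries):
        # Maps agency name -> that agency's own {schema_name: location} registry.
        self._agency_registries = agency_registries

    def find(self, schema_name):
        return [(agency, registry[schema_name])
                for agency, registry in self._agency_registries.items()
                if schema_name in registry]


# Hypothetical entries; the schema name and locations are placeholders.
central = CentralRegistry()
central.register("PostalAddress", "Agency A", "https://example.gov/a/address.xsd")
central.register("PostalAddress", "Agency B", "https://example.gov/b/address.xsd")

portal = FederatedPortal({
    "Agency A": {"PostalAddress": "https://example.gov/a/address.xsd"},
    "Agency B": {"PostalAddress": "https://example.gov/b/address.xsd"},
})

central_hits = central.find("PostalAddress")
federated_hits = portal.find("PostalAddress")
```

Both lookups return the same answers here. The practical difference lies in who stores and maintains the entries, not in what a searcher sees.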

The administration's fiscal 2004 budget allocated $2.1 million to GSA to build the registry, but Congress deleted the funds. GSA concentrated on developing Core.gov instead.

'The business case for a governmentwide registry makes sense if you apply it to a specific technology, but it also makes sense for a broad component repository,' Royal said. 'As we look at higher layers of components, it's not just XML schemas, it's also registering and managing the business processes.'
