XML can solve data distribution problems but not political ones

Shawn P. McCarthy

The federal government has been slow to explore the possibilities of the Web's Extensible Markup Language on a large scale. More is involved than fear of the unknown or simple inertia.

XML promises something close to salvation for federal content managers and webmasters. It can tag and identify specific types of data within documents. It also provides a way to embed 'includes' in dynamically produced content, so a page can pull the latest data from a central database, wherever that database is located.
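To make the tagging idea concrete, here is a minimal sketch of how a program might pick specific values out of a tagged document. The element names, attributes and figures are invented for illustration; they are not part of any real federal tag set.

```python
import xml.etree.ElementTree as ET

# Hypothetical record: the <report> and <statistic> elements, their
# attributes and the numbers are illustrative assumptions only.
SAMPLE = """
<report agency="HHS">
  <statistic name="inspections" year="2000">1250</statistic>
  <statistic name="violations" year="2000">87</statistic>
</report>
"""

def extract_statistics(xml_text):
    """Return {name: value} for every <statistic> element in the document."""
    root = ET.fromstring(xml_text.strip())
    return {s.get("name"): int(s.text) for s in root.iter("statistic")}

stats = extract_statistics(SAMPLE)
print(stats)
```

Because each value carries its own label, a consuming site can ask for one statistic by name instead of scraping it out of formatted text.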

I've been an XML cheerleader and have wondered several times in this space why no one is developing a federal XML tag set for trading data around government networks in real time. I've come to see there are broader concerns that have to be addressed first. XML will not take off in the government until that happens.

Lack of browser support. This was the biggest drawback a year ago, but Microsoft's Internet Explorer 5.0 gave limited support to XML, and now Netscape 6 supposedly provides full support. Lack of XML browsers is no longer an issue.

Competition. Perhaps you work for the Health and Human Services Department and collect data about the proliferation of toaster mites at bagel shops. You discover that someone at the Consumer Product Safety Commission collects similar statistics, as does someone else at the Agriculture Department. Who should be the authoritative source? Should there be a single source? Sharing data via XML means designating one central source to eliminate duplication. But is the government ready for that? Are consumers of government data ready to accept it?

Loss of control of the data. Once data is pulled from a database, it sometimes takes on a life of its own. Site developers must set strict standards for how XML data is gathered and displayed to make sure they tap the right source for the latest data. Agencies that cache data have to ensure that caches get updated as needed, preferably daily.
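As a rough sketch of the caching concern, a site holding a cached copy could check its age before displaying it. The one-day limit mirrors the daily refresh suggested above; the function name and its parameters are assumptions made for this example.

```python
from datetime import datetime, timedelta

# Assumption: "daily" is read as a maximum cache age of one day.
MAX_AGE = timedelta(days=1)

def is_stale(fetched_at, now, max_age=MAX_AGE):
    """Return True when the cached copy is older than max_age."""
    return now - fetched_at > max_age

# A copy fetched 30 hours ago should be refreshed; one fetched 2 hours ago is fine.
now = datetime(2000, 6, 2, 12, 0)
print(is_stale(now - timedelta(hours=30), now))
print(is_stale(now - timedelta(hours=2), now))
```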

Lack of quality control. This is similar to the control problem but focuses more on the integrity of the tagged data as it moves around. If you are the central repository for certain data, can you rely on those who use your data to display it correctly? Will it be quoted in context? Will the tables line up correctly? These are significant issues, and ones you can usually avoid when you control both the data and its display.

Access control and security. If your agency becomes a central repository for XML data that happens to be sensitive, you could find yourself having to authenticate thousands of user sessions. Are you set up to identify each connection and transfer data over it securely? There are several ways to do that, but you probably can't support every type of secure transfer. If you want to be a central XML repository, are you able to dictate how every connection is made?

Availability. Building a government-wide, real-time, central repository brings its own set of challenges. Do you already operate fault-tolerant servers with 99 percent uptime? The more users who rely on you for data, the more important it becomes to deliver without glitches. Beyond simple availability, do you have a disaster recovery plan?
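To put the uptime figure in perspective, a quick calculation shows how much downtime a given percentage still allows over a year. This is plain arithmetic, assuming a 365-day year; the function name is made up for the example.

```python
HOURS_PER_YEAR = 365 * 24  # 8,760 hours, ignoring leap years

def downtime_hours_per_year(uptime_percent):
    """Hours per year a service may be down at the given uptime percentage."""
    return HOURS_PER_YEAR * (100 - uptime_percent) / 100

for pct in (99, 99.9):
    print(f"{pct}% uptime allows about {downtime_hours_per_year(pct):.1f} hours of downtime a year")
```

At 99 percent, that works out to roughly 87.6 hours, or more than three and a half days a year, during which every site relying on the repository goes without data.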

General maintenance. Problems down the hall are easy to solve. Integrated XML may present problems with servers, offices or data located across the country. The problems loom larger across time zones.

Eventually, all these issues will be settled and government adoption of XML will move forward, but it will take patience. When properly and painstakingly implemented, XML can solve a lot of information availability problems for agencies, notably in handling Freedom of Information Act requests. But XML can't resolve the political problems above.

Shawn P. McCarthy designs products for a Web search engine provider. E-mail him at [email protected].

