Government’s new second language: XML
- By William Jackson
- Sep 20, 2012
When the Government Printing Office updated its composition engine for digital publishing, the ability to accept XML input directly was a key requirement.
The Extensible Markup Language allows documents to be encoded in a format that is both human- and machine-readable, making it a good choice for documents being accessed over the Internet. It allows richer metadata and improved search capability, said GPO CTO Ric Davis.
“It has a lot of advantages for us internally in terms of process,” Davis said.
XML can foster interoperability and data sharing and has become the go-to format for information in emerging electronic government initiatives. In a speech last year to IT industry officials, federal CIO Steve VanRoekel advocated the use of the language by government.
“I envision a set of principles like ‘XML First,’ ‘Web Services First,’ ‘Virtualize First,’ and other firsts that will inform how we develop our government’s systems,” VanRoekel said. “They will effectively establish a new default setting for architecting solutions government-wide, and they will be continuously updated as new technologies emerge to ensure that our government is at the frontier of advancements that yield a higher return on our IT investments, increase productivity, and improve the way the government interacts with the American people.”
The Federal CIO Council has established an XML Working Group “to facilitate the efficient and effective use of XML through cooperative efforts among government agencies, including partnerships with commercial and industrial organizations.”
XML was developed by the World Wide Web Consortium, an international standards-making organization for the Web, as an offshoot of the ISO’s Standard Generalized Markup Language.
Work began in 1996, and the XML 1.0 specification was finished in 1998. It has gone through five editions in 10 years without changing the version number. A second version, XML 1.1, was published in 2004 and was intended to make the language easier to use, in part by allowing additional characters. Both versions are in common use.
A markup language is a standardized way of annotating a document to define its format and characteristics when published. XML uses Unicode text characters, and an XML parser passes the markup, which is set off from the text, to an application for use. One of the design goals of XML is that it should be easy to write programs that process XML. The markup identifies the elements of the document and defines their attributes.
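To illustrate how a parser hands elements and attributes to a program, here is a minimal sketch using the parser in Python's standard library. The sample document, its element names (publication, title, agency, body) and its attribute values are invented for illustration and do not represent any GPO format.

```python
import xml.etree.ElementTree as ET

# Hypothetical document: every element and attribute name here is made up
# for illustration and is not taken from any real government schema.
SAMPLE = """<publication id="example-001" type="notice">
  <title>Sample Notice</title>
  <agency>Example Agency</agency>
  <body>Text of the notice.</body>
</publication>"""


def summarize(xml_text):
    """Parse an XML string and return its root element name,
    the root's attributes, and the text of each direct child element."""
    root = ET.fromstring(xml_text)
    return {
        "element": root.tag,
        "attributes": dict(root.attrib),
        "children": {child.tag: child.text for child in root},
    }


if __name__ == "__main__":
    info = summarize(SAMPLE)
    print(info["element"])
    print(info["attributes"])
    print(info["children"])
```

The same text that a person can read in an editor is available to the program as named elements and attributes, which is what makes metadata-driven search and data sharing practical.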
William Jackson is a senior writer for GCN and the author of the CyberEye blog.