Shawn P. McCarthy | The big picture on metadata
Metadata, and all the promise and baggage the word carries with it, is again a talking point within the federal government.
But when government metadata is discussed, the issues of standards, compatibility and control usually follow, with various factions championing competing solutions. Thus, the metadata issue must be addressed governmentwide before real data sharing can take root.
Most government information technology managers know that a few years ago the Office of Management and Budget worked with the CIO Council to release the Federal Enterprise Architecture Data Reference Model. One can debate whether it's a true metadata model or just a general structure for broad classification of government data, but it has had a moderate impact.
The DRM specifies three layers for all data produced, stored or transferred by government organizations: the Data Description (a means to describe data and support discovery and sharing), Data Context (which aids data discovery via taxonomies and directories) and Data Sharing (which enables access and exchange of data via set types of cross-system transactions).
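To make the three layers concrete, here is a minimal Python sketch of how a single data asset might carry all three kinds of information. The class names, field names and sample records are hypothetical illustrations, not part of the DRM specification itself.

```python
from dataclasses import dataclass


@dataclass
class DataDescription:
    """Describes a data asset so it can be discovered and shared."""
    name: str
    fields: dict[str, str]  # field name -> data type (hypothetical layout)


@dataclass
class DataContext:
    """Places the asset in a taxonomy to aid discovery."""
    taxonomy_path: list[str]


@dataclass
class DataSharing:
    """Lists the exchange types through which the asset is available."""
    exchange_types: list[str]


@dataclass
class DrmRecord:
    """One asset described along all three layers."""
    description: DataDescription
    context: DataContext
    sharing: DataSharing


def find_by_topic(records: list[DrmRecord], topic: str) -> list[str]:
    """Return the names of assets whose taxonomy path includes the topic."""
    return [r.description.name for r in records if topic in r.context.taxonomy_path]


# Hypothetical sample records, for illustration only.
RECORDS = [
    DrmRecord(
        DataDescription("grant_awards", {"award_id": "string", "amount": "decimal"}),
        DataContext(["finance", "grants"]),
        DataSharing(["query", "bulk extract"]),
    ),
    DrmRecord(
        DataDescription("facility_inspections", {"site_id": "string", "date": "date"}),
        DataContext(["health", "inspections"]),
        DataSharing(["query"]),
    ),
]
```

In this sketch, a call such as find_by_topic(RECORDS, "grants") relies only on the context layer, while a partner system deciding how to retrieve the asset would consult the sharing layer.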
Despite this high-profile model, some government agencies obviously feel that additional metadata structures should be explored.
In late March, the General Services Administration issued a request for information relating to the semantic representation of knowledge (GCN.com/761). That document listed several evolving metadata, classification and sharing technologies that GSA is interested in exploring.
It's not yet clear if this effort will work in concert with the Data Reference Model or if it will take metadata in a new direction. But GSA obviously sees a need to investigate additional structures.
The Network Centric Operations Industry Consortium, which includes several government and industry participants, has issued a report called 'NCOIC Interoperability Framework-Communications,' which recommends a combination of broadcast/multicast support and support for well-known metadata server information. For details, go to GCN.com/762.
There's also the Federal Enterprise Architecture Records Management profile, which requires a uniform method for federal records management. This includes a push from the National Archives and Records Administration requiring agencies to establish procedures for directing records management when approving new IT systems.
At the same time, government agencies are starting to discover a rapidly growing, independent Extensible Markup Language format known as Resources of a Resource. It's a streamlined system for describing any data object in a generic fashion, allowing any search system to understand and document the content.
It's notable that many federal agencies have established a taxonomy of sorts to help government employees and citizens navigate through available information. But these informal taxonomies often are not structured for automated techniques such as query expansion, taxonomy integration or data classification. Without this business context, they're no more useful than a typical yellow-pages directory.
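Query expansion is one of the automated techniques a structured taxonomy makes possible. The Python sketch below assumes a taxonomy stored as a simple mapping from each term to its narrower terms; the terms themselves and the expand_query function are hypothetical and do not reflect any agency's actual system.

```python
# Hypothetical taxonomy: each term maps to its narrower terms.
TAXONOMY = {
    "benefits": ["health benefits", "retirement benefits"],
    "health benefits": ["medicare", "medicaid"],
    "retirement benefits": ["social security"],
}


def expand_query(term: str, taxonomy: dict[str, list[str]]) -> list[str]:
    """Return the term plus every narrower term reachable in the taxonomy."""
    expanded = []
    seen = set()
    stack = [term]
    while stack:
        current = stack.pop()
        if current in seen:
            continue  # guard against terms listed under more than one parent
        seen.add(current)
        expanded.append(current)
        stack.extend(taxonomy.get(current, []))
    return expanded
```

A search for "benefits" would then also match documents tagged only with the narrower terms. An informal, navigation-only taxonomy offers no such machine-readable relationships to walk.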
The structures outlined above are capable of complementing, rather than competing with, each other. They have great potential for helping the government store data, or at least information about data. But the real solution lies in constructing multiagency repositories that manage records according to governmentwide rules. Without that, agencies end up maintaining their own sets of metadata and wondering why governmentwide data discovery is so incomplete.
Shawn P. McCarthy is senior analyst and program manager at IDC Government Insights, of McLean, Va. E-mail him at firstname.lastname@example.org.