Mike Daconta

COMMENTARY

With info sharing, context is as important as content

A narrow focus on content can prevent programs from making the connection

Daconta is chief technology officer at Accelerated Information Management LLC and former metadata program manager at the Homeland Security Department. His latest book is titled “Information as Product: How to Deliver the Right Information to the Right Person at the Right Time.”

In commenting on the recent Shirley Sherrod incident, Fox News host Glenn Beck, among many other news commentators, said “context matters.” Context, the environment or situation surrounding a particular target, is also a critical component of federal data architectures, and it must be planned and implemented before an incident occurs that requires it.

Every chief information officer must ask, “How do we create a data architecture that captures the context around our federal records?” A rich context enables you to assemble a complex picture on the fly because you know where the puzzle pieces fit. In fact, context is so critical that I see its implications and implementations cropping up in more than a half-dozen areas of data architecture, such as process monitoring, business glossaries, master data management, metadata catalogs, information exchanges and data warehousing.

In this article, we will examine three information management projects in which context plays a key role in the solution.




First, I was recently reviewing some Extensible Markup Language schemas that represented an important government form. The schema defines the document for the applicant’s responses, which will be reported to multiple agencies that need to take action on those responses. Unfortunately, the XML design failed to take into account anything outside that single form, while the context of the data on the form was relevant to a number of external systems and indirect consumers.

The problem is that a narrow, project-only perspective fails to capture the semantics needed for a larger audience. A simple example is modeling only the single role that a person plays rather than modeling a person who can play multiple roles: employee and parent, for example, can be different roles of one person. I cannot overemphasize how critical this is to effective information sharing.
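To make the role example concrete, here is a minimal sketch of a data model in which one person holds several roles. The class names, field names and sample values are all hypothetical; they illustrate the idea and are not taken from the form or schema discussed above.

```python
from dataclasses import dataclass, field
from typing import List, Set


@dataclass
class Role:
    """One role a person plays (hypothetical structure)."""
    name: str     # e.g. "employee" or "parent"
    context: str  # the system or form in which the role applies


@dataclass
class Person:
    """A person modeled once, independent of any single form."""
    person_id: str
    full_name: str
    roles: List[Role] = field(default_factory=list)

    def role_names(self) -> Set[str]:
        # Collect the distinct role names this person holds.
        return {role.name for role in self.roles}


# Illustrative data only: the same person appears in two contexts.
person = Person("p-001", "Jane Doe")
person.roles.append(Role("employee", "payroll system"))
person.roles.append(Role("parent", "benefits form"))
```

A form-specific design would instead define a separate "employee" record and a separate "parent" record, leaving downstream consumers to guess that the two describe the same individual.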

Second, I am supporting a master data management project that is integrating asset data across multiple information technology systems. The integrated product team is connecting data via its financial and geospatial context. Let’s briefly ponder those two contexts: One involves a source (in this case, a source of funds), and the other involves a static element (in this case, a location). Sometimes, the context of the data can be even more important than the data itself. That is especially true when you are primarily concerned with connections among things.
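The idea of connecting records through shared context can be sketched in a few lines. The example below groups asset records from different systems by a financial key and a geospatial key. The system names, field names (`fund_code`, `site`) and values are invented for illustration and do not describe the actual project.

```python
from collections import defaultdict

# Hypothetical asset records from three separate IT systems.
records = [
    {"system": "finance", "asset_id": "F-17", "fund_code": "OPS-2010", "site": "Bldg 4"},
    {"system": "facilities", "asset_id": "A-203", "fund_code": "OPS-2010", "site": "Bldg 4"},
    {"system": "inventory", "asset_id": "INV-9", "fund_code": "CAP-2009", "site": "Bldg 4"},
]


def link_by_context(records, keys):
    """Group records that share the same values for the given context keys."""
    groups = defaultdict(list)
    for rec in records:
        context = tuple(rec[k] for k in keys)
        groups[context].append((rec["system"], rec["asset_id"]))
    return dict(groups)


# Link on financial context (fund_code) and geospatial context (site).
linked = link_by_context(records, ("fund_code", "site"))
```

Here the asset identifiers never match across systems, yet the first two records are linked because they share the same funding source and location. The connection comes entirely from context, not from the content of the identifier fields.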

Finally, a promising new contextual approach to developing information exchanges is emerging. As with the XML design problem in our first example, information exchanges can fall prey to an overly narrow perspective. Given that modern development platforms can automatically generate code to process XML documents, a narrow perspective can affect the exchange and any code that processes that exchange. The new approach being spearheaded by forward-thinking elements of the Army and Air Force is to create the semantics first, via a high-fidelity data model called an ontology, and then generate the XML schemas from that model.

Although not based on the Web Ontology Language, the National Information Exchange Model takes a similar approach, in which the XML schemas are generated from a database-backed data model. The contextual nature of this approach is that the ontology uses a more top-down, enterprise perspective to guide the inclusion of bottom-up exchanges.
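The model-first pattern described above can be sketched roughly as follows. This is not the Army, Air Force or NIEM tooling, and a plain Python dictionary stands in for a real ontology; the concept and property names are assumptions made up for the example. The point is only that the schema is derived from a shared model rather than written by hand for one exchange.

```python
import xml.etree.ElementTree as ET

XS = "http://www.w3.org/2001/XMLSchema"

# Hypothetical enterprise model: concept name -> {property: XSD type}.
MODEL = {
    "Person": {"fullName": "xs:string", "birthDate": "xs:date"},
    "Asset": {"assetId": "xs:string", "fundingSource": "xs:string"},
}


def generate_schema(model):
    """Build an XML Schema element tree from the model dictionary."""
    schema = ET.Element("xs:schema", {"xmlns:xs": XS})
    for concept, props in model.items():
        # One complex type per concept, with one child element per property.
        ctype = ET.SubElement(schema, "xs:complexType", {"name": concept + "Type"})
        seq = ET.SubElement(ctype, "xs:sequence")
        for prop, xsd_type in props.items():
            ET.SubElement(seq, "xs:element", {"name": prop, "type": xsd_type})
        # A top-level element so exchanges can carry the concept directly.
        ET.SubElement(schema, "xs:element", {"name": concept, "type": concept + "Type"})
    return schema


xsd_text = ET.tostring(generate_schema(MODEL), encoding="unicode")
```

Because every exchange schema is generated from the same model, adding a property to a concept in the model propagates to all exchanges that use it, which is the top-down guidance the approach is meant to provide.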

The heightened awareness and use of context were mirrored on the commercial front by Google’s purchase of Metaweb and the company’s Freebase entity graph.

The elevation of context in our information management activities is a sign of a more aggressive attitude toward actively managing our data so that we can take advantage of its potential. The key to mastering context is to understand the role of metadata in your organization and how to effectively design it. Simply put, metadata captures context, whereas your data is content. If you can trace a line from your content to its context and then the consumer, you will have mastered your information.

Reader Comments

Mon, Aug 30, 2010 Peter Benson Bethlehem, PA

As the project leader for ISO 8000, the new international standard for data quality developed with DoD funding, and a regular participant in the data and information quality forums, maybe I can shed some light on the issue. In the first place, it helps to be able to consider data and information separately. Context is clearly an information issue. Quality data may not yield quality information, but quality information can only be based on quality data. ISO 8000 defines data quality in terms of syntax (there must be one), semantic encoding (must be unambiguous and permanent) and the all-important "meets requirements." To these fundamental characteristics of data quality are added assertions or warranties of provenance, accuracy and completeness. Quality information has to start with quality data, and what we have found is that data quality is highly dependent on the quality of the expression of the data requirement. A form is a data requirement statement, and as a first step we need to make sure all government forms are at the very least ISO 8000 compliant. There is now an open technical dictionary containing over 2 million concepts with public domain identifiers; using this dictionary is the first step in creating quality "portable" and mappable data. For more information, see eccma.org.

Tue, Aug 10, 2010 Richard Ordowich

I have never seen a database-backed model for NIEM. NIEM provides data elements and XML schemas for data exchanges.

Mon, Aug 9, 2010 Loretta Mahon Smith Washington, DC

More than just context matters. There are three C's that a CIO needs to keep in mind: content, context and continuity.

Content: Is the data complete and correct?

Context: What does the data really mean? How can it be used and reused in a controlled and known fashion?

Continuity: How available and accessible does the data need to be over time? How is that ensured?

DAMA International is an organization of business and data management professionals who have collaborated to create a body of knowledge about data management best practices that your CIO and other technical readers can leverage to create data architectures that are stable and expandable. See http://bit.ly/arolzn for the DAMA Guide to the Data Management Body of Knowledge, and to find out more about this community that has been working with content, context, and continuity for both structured and unstructured data.
