Michael Daconta | My data's not ok, yours is not ok
Reality Check columnist
In the 1967 classic self-help book 'I'm OK, You're OK,' Dr. Thomas Harris defined four life positions (combinations of 'I'm OK-ness' and 'you're OK-ness') that determine your success in transactions with others.
Like people, organizations can exhibit these life positions in relation to their information technology and data identity.
In relation to data quality, many organizations seem to act from the 'My data's not OK, and your data's OK' position. That is to say, 'I have a problem, but maybe no one will notice.' It makes people want to ignore or hide their problem, kind of like hiding crazy Uncle Ned in the attic.
Having witnessed several data quality denials, I am convinced that there is a deeper problem at work here. Why deny data quality problems?
Fortunately, it's OK to admit your dirty little secret: You have bad data. Guess what? So does everyone. For large datasets, bad data (inaccurate, nonstandard, incomplete, inconsistent and dirty) is the norm.
The root of the problem is the belief that everyone else is OK so I'd better hide Uncle Ned.
However, if you manage data, you can breathe a sigh of relief, because you're not OK and I'm not OK. Does that mean we do nothing?
No, it means we find out if old Uncle Ned is the harmless kind of crazy or something more serious. Here are three points to help you tackle the problem of Uncle Ned.
All data is not equal
Not all your data is worth monitoring and cleansing. Run the other way anytime a consultant says, 'All data can be ....' Not all your data is relevant, not all your data is depended on, not all your data is shared, and not all your data is used to make decisions.
You need a method to properly scope the data you will manage.
At the Transportation Security Administration, we introduced the concept of enterprise data: data that is common across business units, shared internally or externally, or used for decision support.
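As a rough illustration, the three criteria above can be written as a simple scoping test. This is only a sketch under stated assumptions: the `DataElement` fields, the catalog entries, and the rule that "common" means used by more than one business unit are hypothetical choices for the example, not the method actually used at the agency.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DataElement:
    """Hypothetical catalog entry for one data element."""
    name: str
    business_units: frozenset  # business units that use the element
    shared: bool               # shared internally or externally
    decision_support: bool     # used to support decisions


def is_enterprise_data(element):
    """Apply the three criteria: common, shared, or used for decision support."""
    # Assumption: "common" means more than one business unit uses the element.
    common = len(element.business_units) > 1
    return common or element.shared or element.decision_support


def scope(elements):
    """Return the names of the elements that qualify as enterprise data."""
    return [e.name for e in elements if is_enterprise_data(e)]


# Hypothetical catalog, invented for illustration only.
catalog = [
    DataElement("employee_id", frozenset({"hr", "operations"}), False, False),
    DataElement("cafeteria_menu", frozenset({"facilities"}), False, False),
    DataElement("monthly_incident_count", frozenset({"operations"}), False, True),
]

in_scope = scope(catalog)
print(in_scope)
```

Anything the test leaves out, such as the cafeteria menu here, stays outside the data-quality program until someone can show that it is common, shared or decision-relevant.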
Things that are common, shared and support decision-making are top candidates for data-quality initiatives.
Line of sight is key
In enterprise architecture, a key technique is to connect performance objectives to business processes to IT investments.
Such direct and clear linkages make it easy to prove business value, and you can apply this technique in judging the needed quality of data.
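One way to picture such linkages for data is a small dependency map that can be walked in either direction. The sketch below is an assumption-laden illustration: the `DERIVED_FROM` map, its `kpi:`, `element:` and `source:` names, and both helper functions are hypothetical, and a real environment would draw this information from a metadata repository or lineage tool.

```python
# Hypothetical lineage map: each item lists the items it is derived from.
DERIVED_FROM = {
    "kpi:on_time_rate": ["element:on_time_count", "element:total_count"],
    "element:on_time_count": ["source:ops_db.events"],
    "element:total_count": ["source:ops_db.events", "source:schedule_feed"],
}


def trace_to_sources(item, derived_from):
    """Walk upstream from an item and return the root sources it depends on."""
    seen, sources, stack = set(), set(), [item]
    while stack:
        node = stack.pop()
        if node in seen:
            continue
        seen.add(node)
        upstream = derived_from.get(node, [])
        if not upstream:
            # Nothing upstream is recorded, so treat this node as a source.
            sources.add(node)
        stack.extend(upstream)
    return sources


def impacted_by(item, derived_from):
    """Walk downstream from an item and return everything derived from it."""
    impacted, frontier = set(), [item]
    while frontier:
        node = frontier.pop()
        for child, parents in derived_from.items():
            if node in parents and child not in impacted:
                impacted.add(child)
                frontier.append(child)
    return impacted


kpi_sources = trace_to_sources("kpi:on_time_rate", DERIVED_FROM)
feed_impact = impacted_by("source:schedule_feed", DERIVED_FROM)
print(sorted(kpi_sources))
print(sorted(feed_impact))
```

The upstream walk answers where a number came from; the downstream walk answers what breaks if a source changes.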
Can you trace the lineage of a shared data element back to its source? Do you know the impact of changing a common data element used in multiple IT systems? Can you trace backward from a key performance indicator to the data elements used in the calculation?
End-to-end tool support has arrived
Vendors have made steady progress in expanding their tool suites up and down the technology stack.
They have reached the point where they can offer a complete, end-to-end solution. I like to call these the dashboard-to-data-source suites.
Several of the top vendors offer integrated tool suites to define, profile, cleanse, transform, deliver and present data.
Some of the vendors have expanded from extract, transform and load tools. Others have worked backward from business intelligence tools.
Regardless of how the tools have matured, they have reached a point where they can help you implement end-to-end information management.
This is important because it is easier to define a process around a tool that walks you through a predefined set of capabilities. This helps you get started and quickly gain traction.
If you keep in mind these three points, you can improve the quality of your enterprise data. Then you can feel confident when your data is shared or presented to decision-makers.
And well-founded confidence, not denial, is definitely OK.
Daconta is chief technology officer at Accelerated Information Management LLC and former metadata program manager at the Homeland Security Department. His latest book is 'Information as Product: How to Deliver the Right Information to the Right Person at the Right Time.'