BP spill's lessons about data and transparency
Perspective often affects how data is presented
Michael Daconta (firstname.lastname@example.org) is the chief technology officer of Accelerated Information Management and the former metadata program manager for the Homeland Security Department. His latest book is “Information as Product: How to Deliver the Right Information to the Right Person at the Right Time.”
As I write this, the Gulf Coast is still scrambling to deal with the onslaught of oil gushing from a wellhead on the ocean floor. Although I do not wish to deflect attention from the primary narrative of capping, containing and cleaning up the oil spill, I want to examine some of the events and activities from the perspective of information management and extract some lessons learned.
There has been a lot of controversy regarding the initial and often revised estimates of the amount of oil spilling into the Gulf on an hourly and daily basis. Estimates have ranged from an initial BP estimate of 1,000 barrels a day up to scientists’ estimates, later on, of 50,000 to 100,000 barrels a day. The most recent estimate, as of this writing, is 25,000 to 30,000 barrels a day. The real issue here is the magnitude of the difference between the estimates (as much as a hundredfold), the reasons for the difference and the ramifications for the idea of transparency.
In my opinion, the statistical data and dueling algorithms that created the estimates are a mere sideshow. More important is the notion of how perspective affects data and how we can apply that same lesson to Data.gov, the online repository of federal government data.
It is obvious that bias and self-interest affect most everything, including data collection, data analysis and data presentation. Although we often like to think of data as being objective, the reality is that a lot of data is affected by the perspective of those who collect, create or select it. Cherry-picking data is a favorite technique of political campaigns, for example.
This phenomenon, and a failure to understand its influence, probably affected the selection of some “high-value” datasets required to comply with the Open Government Directive. If you examine those datasets posted on Data.gov, you might ask yourself, “Who selected these, and how the heck did they consider them high-value?” The most egregious example is the Interior Department’s considering the population counts of wild horses and burros to be a high-value dataset!
The lesson here is that perspective affects the selection process because you cannot make a sensible selection without first answering the question, “Of high value to whom?” Since Data.gov is a national initiative, it is highly unlikely that the majority of citizens would find the population counts of wild burros useful. The more likely explanation for its appearance is that it was an easy dataset to provide, rather than a high-value one.
The second lesson here involves how the controversy over the estimates of the Gulf spill grew into an indictment of transparency. Between May 15 and May 21, the White House, Environmental Protection Agency, and the Homeland Security and Interior departments all called for greater transparency from BP. What we saw in those six days was, first, an erosion of trust and, second, another – oft-repeated – lesson that “selective transparency” is an oxymoron. The key lesson here for federal agencies is that transparency is an all-or-nothing proposition; it is not something you can wade into or try out. If you want to properly wear the badge of transparency and earn the admiration and trust of your constituents, you have to be all-in.
The Deepwater Horizon Unified Command Web site lists 15 organizations involved in the response. There has been a lot of criticism about this coordinated public/private response, on issues of response time, preparedness and leadership. Although the presidential commission on the spill will sort out the details, there is a clear connection between information sharing and rapid coordination among members of such loose-knit teams. And given other recent events, such as the attempted bombing of a Christmas Day flight, it is clear that the government’s work on information sharing is far from complete.
On that note, it is encouraging that Kshemendra Paul was recently selected as the new program manager for the federal government’s Information Sharing Environment. I have worked with Paul in the information-sharing trenches, and he brings a great deal of experience and leadership to this tough set of problems.