More robust analytical tools needed
The current state of information management might be summarized as, "Too much data, not enough insight."
In a survey conducted in August 2012 by the 1105 Government Information Group, 75 percent of respondents agreed that they had to do a better job of analyzing data rather than hoarding it. In addition, roughly half agreed that their current Big Data platforms can't support their tasks (see Figure 1).
With three-quarters of respondents from government agencies feeling at least somewhat overwhelmed by their ever-growing data stores, it’s not surprising they are searching for more robust analytical tools.
The survey found that federal agencies have increasing interest in Hadoop-enabled applications to support Big Data initiatives. More than six in ten respondents said they were familiar with the technology; of those, 51 percent have adopted it and 31 percent are investigating it (see Figure 2).
Hadoop is open-source software that was specifically developed to analyze and transform both structured and unstructured data. It provides a way to do parallel processing of all this data, as well as store and manage it. The technology is intrinsic to the search capabilities of Google and to social media sites such as Facebook and LinkedIn.
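The parallel processing Hadoop provides follows the map/reduce pattern: a "map" step turns raw records into key-value pairs, and a "reduce" step aggregates them by key. A minimal sketch of that pattern, in plain Python rather than the actual Hadoop API (the word-count task and function names here are illustrative only):

```python
from collections import defaultdict

def map_phase(records):
    # Map step: emit a (word, 1) pair for every word in each
    # line of unstructured text. In Hadoop, this work would be
    # spread across many nodes in a cluster.
    for line in records:
        for word in line.lower().split():
            yield (word, 1)

def reduce_phase(pairs):
    # Reduce step: group the pairs by key and sum the counts.
    counts = defaultdict(int)
    for word, n in pairs:
        counts[word] += n
    return dict(counts)

lines = ["big data big insight", "data needs insight"]
print(reduce_phase(map_phase(lines)))
# {'big': 2, 'data': 2, 'insight': 2, 'needs': 1}
```

Because each map task touches only its own slice of the input, and reduce tasks each handle a disjoint set of keys, the same program can be run unchanged on one machine or hundreds, which is what makes the model attractive for ever-growing data stores.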
"Coming first, we’ll see a lot more prototypes based on the Hadoop framework, because it’s so easy to start,” predicts Bob Gourley, former CTO of the Defense Intelligence Agency (DIA) and founder of Crucial Point LLC, a technology research and advisory firm. “It takes a little bit of training and all the software is free. It’s easy to start a prototype, but it’s hard to field a solution because you need programming and development. We are seeing more standalone applications that can ride on top of Hadoop, which the average worker can use."
According to the 1105 Government Information Group survey, Department of Defense (DOD) agencies are looking for more complex tools, such as complex event processing, OLAP tools and natural language processing. Federal civilian agencies, despite having fewer tools and smaller Big Data projects, are attracted to tools such as hand-coded SQL, NoSQL or non-indexed DBMS, and text mining analytics (see Figure 3).
“They want Big Data in a box,” contends Michael Daconta, former metadata program manager for the Homeland Security Department and author of "Information as Product: How to Deliver the Right Information to the Right Person at the Right Time." “That’s okay. There may be a certain amount of low-hanging fruit, but you won’t get the biggest bang for the buck from a standardized tool set.”
Gourley says Big Data is at a transition point where data scientists and developers are needed to field Big Data solutions. “Most of the folks doing that now are from the IT shop,” Gourley says. “People are starting to think about what the world will look like in a few months, when we start to see a wide range of low-cost tools that can empower any user. When that happens, individuals who can’t write SQL queries will have access to Big Data tools."
The most popular of the traditional tools is advanced data visualization -- 24 percent of survey respondents are considering this technology; 27 percent have it and are looking to replace it; and 19 percent have the technology and are happy with it (see Figure 4).
Eric Sweden, program director, Enterprise Architecture & Governance for the National Association of State Chief Information Officers (NASCIO), says the interest in more sophisticated tools reflects the unique nature of Big Data. “Some of it is not statistical,” he says. “Some of it is looking at a big screen with colors and concentrations.”
For example, NASA has produced a video that spools out a decade’s worth of data demonstrating the consequences of drought by showing the changes in groundwater under the surface. Without visualization, the underlying data would not be anywhere near as impactful.