How will you manage big (and bigger) data in 2015?

As the stream of data hitting government agencies grows, the importance of managing it will expand as well. And it is not just the volume of data that’s growing. The variety of data sources is expanding too, as video and sensor data from the Internet of Things make their way into government data centers and enterprise networks. In fact, market researcher IDC forecasts that local governments competing to build smart cities by 2018 will drive over 25 percent of government spending to tap the value of IoT-based programs and applications.

And while government open data programs have proliferated in recent years, the real challenge is making that data readily available to a growing number of users.

“Open data is a nice thing, but most open data is not consumable by people who are not technical or do not have a technical infrastructure at their beck and call,” said Keith Donley, enterprise data manager at the Virginia Department of Transportation.

The upshot: government IT departments will increasingly turn to data warehouse augmentation tools and tactics in 2015 to help manage their big data challenges.

Sandboxes for experimentation

A key technique for handling the data surge is to provide a staging area, or sandbox, in which organizations can explore new data sets before deciding whether to add them to a data warehouse.

VDOT uses business analytics software from Tableau Software Inc. to augment the agency’s enterprise data warehouse. Tableau provides a sandbox in which VDOT can combine sample data with existing data in the warehouse. This mixing of data from different sources, a process called data blending, lets VDOT determine whether the sample data is valuable enough to warrant adding to its data warehouse, Donley said.
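The blending step Donley describes can be pictured as a join between a sample data set and records already in the warehouse. The sketch below is a minimal illustration in Python, not VDOT’s or Tableau’s actual method. The field names (`route_id`, `avg_daily_traffic`, `sensor_count`), the sample rows and the `blend` and `match_rate` helpers are all hypothetical.

```python
from typing import Any, Dict, List

Row = Dict[str, Any]

# Hypothetical rows already held in the warehouse, keyed by route.
warehouse_rows: List[Row] = [
    {"route_id": "R1", "avg_daily_traffic": 12000},
    {"route_id": "R2", "avg_daily_traffic": 4500},
    {"route_id": "R3", "avg_daily_traffic": 800},
]

# Hypothetical sample data being evaluated in the sandbox.
sample_rows: List[Row] = [
    {"route_id": "R1", "sensor_count": 14},
    {"route_id": "R3", "sensor_count": 2},
    {"route_id": "R9", "sensor_count": 5},  # no matching warehouse row
]


def blend(warehouse: List[Row], sample: List[Row], key: str) -> List[Row]:
    """Left-join sample fields onto warehouse rows that share `key`."""
    sample_by_key = {row[key]: row for row in sample}
    blended = []
    for row in warehouse:
        merged = dict(row)  # copy so the warehouse rows stay untouched
        match = sample_by_key.get(row[key])
        if match is not None:
            merged.update(match)
        blended.append(merged)
    return blended


def match_rate(warehouse: List[Row], sample: List[Row], key: str) -> float:
    """Fraction of sample rows that line up with a warehouse row."""
    if not sample:
        return 0.0
    warehouse_keys = {row[key] for row in warehouse}
    hits = sum(1 for row in sample if row[key] in warehouse_keys)
    return hits / len(sample)
```

A low match rate, or blended rows that add little to existing reports, would be one signal that the sample data does not yet belong in the warehouse.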

Bill Franks, chief analytics officer of Teradata Corp., said the need to make data available for experimentation is a key force behind the data warehouse augmentation trend. For one thing, data warehouses are typically used for data that is well understood.

But the value of the voluminous data generated via temperature and humidity sensors may not be immediately known. “There is simply so much more data available and a lot of that data is of unknown use and quality at the time it may be collected,” said Franks.

Another way to augment a data warehouse is to offload big data processing to another technology platform such as a Hadoop cluster. However, Taha Kass-Hout, chief health informatics officer at the Food and Drug Administration, suggested that Hadoop may not fit every application.

“As FDA acquires big data in the cloud, appropriate technologies will be deployed,” Kass-Hout said. “Hadoop is just one technology option and may not necessarily be appropriate for the particular data or data use.”

Increasing data availability

Once agencies determine their approach for storing high-value data sets, the next step is making them available. Traditionally, business users would need to ask the IT shop to generate a report if they wanted to tap a data warehouse. But agencies will be making more data accessible in the cloud, a move that is already underway.

One example is openFDA, which operates in a public cloud environment. The project, which debuted in June 2014, aims to “create easy access to public data,” according to the openFDA web site. Kass-Hout said openFDA uses new technologies such as Elasticsearch, Luigi, Node.js and the JavaScript Object Notation open standard to manage the data demand.

Those technologies result “in very fast response times, even when there are very many simultaneous users,” according to Kass-Hout, who added that FDA was considering applying these technologies to internal FDA databases next year, “whether or not they are in the cloud.”
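Because openFDA serves its data as JSON over HTTP, a client needs little more than a URL and a JSON parser. The sketch below shows one way that might look in Python. It assumes the public drug adverse event endpoint and its `search` and `limit` query parameters; the search expression and the sample response body are illustrative only, and the helper names are hypothetical. No network request is made.

```python
import json
from typing import Any, Dict, List
from urllib.parse import urlencode

# Assumed endpoint for openFDA drug adverse event reports.
BASE_URL = "https://api.fda.gov/drug/event.json"


def build_query_url(search: str, limit: int = 5) -> str:
    """Build a query URL from a search expression and a result limit."""
    if limit < 1:
        raise ValueError("limit must be at least 1")
    return BASE_URL + "?" + urlencode({"search": search, "limit": limit})


def extract_results(body: str) -> List[Dict[str, Any]]:
    """Parse a JSON response body and return its list of results."""
    payload = json.loads(body)
    return payload.get("results", [])


# Illustrative search expression, not taken from the article.
url = build_query_url("patient.drug.medicinalproduct:aspirin", limit=5)

# Made-up response body that only mimics the general shape of a reply.
sample_body = '{"meta": {}, "results": [{"example_field": "example"}]}'
records = extract_results(sample_body)
```

In practice the URL would be fetched with an HTTP client and the response text passed to `extract_results`.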

VDOT, meanwhile, uses Tableau Public, a data visualization software tool, to make data available to both internal personnel and the public. The tool hosts an organization’s dashboards, tables, graphs and other visualizations in the cloud.

VDOT’s Traffic Engineering division, for instance, uses Tableau Public to publish data on traffic accidents in the state, Donley said. Users can apply a range of filters to analyze the data, including location, year, crash severity and weather conditions.
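The kind of filtering Donley describes amounts to narrowing a set of crash records by whichever fields a user selects. The following Python sketch illustrates the idea only; it is not how Tableau Public works internally. The `Crash` record, its field values and the sample rows are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Crash:
    location: str
    year: int
    severity: str
    weather: str


# Hypothetical records standing in for published crash data.
CRASHES: List[Crash] = [
    Crash("County A", 2013, "minor", "rain"),
    Crash("County A", 2014, "severe", "clear"),
    Crash("County B", 2014, "minor", "rain"),
    Crash("County B", 2014, "severe", "snow"),
]


def filter_crashes(
    crashes: List[Crash],
    location: Optional[str] = None,
    year: Optional[int] = None,
    severity: Optional[str] = None,
    weather: Optional[str] = None,
) -> List[Crash]:
    """Return crashes matching every filter that was supplied."""
    wanted = {
        "location": location,
        "year": year,
        "severity": severity,
        "weather": weather,
    }
    # Filters left as None are ignored, so users can combine any subset.
    active = {name: value for name, value in wanted.items() if value is not None}
    return [
        crash
        for crash in crashes
        if all(getattr(crash, name) == value for name, value in active.items())
    ]


rainy_minor = filter_crashes(CRASHES, severity="minor", weather="rain")
```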

Mobile accessibility

The arrival of business intelligence on mobile platforms could also contribute to making big data more accessible to users in the new year.

Products are available from vendors such as MicroStrategy, which offers mobile business intelligence products including its Analytics App for iPad. In another example, startup Databox in November unveiled its Databox for Enterprise, a mobile business intelligence platform that the company said is designed for decision makers rather than data mavens.

The FDA may be one agency adopting such mobile technologies next year. According to Kass-Hout, “FDA is investigating the use of mobile hardware by field staff that would have the same access to business intelligence tools as the usual laptops issued to FDA staff.” 

About the Author

John Moore is a freelance writer based in Syracuse, N.Y.

