What is different about big data is the ability of emerging cloud environments to process unlimited amounts of data.
While the hype cycle for big data has officially jumped the shark, as evidenced by a recent Dilbert cartoon strip, most government managers are still confused about what this “new new” thing is, how it’s different from what they are already doing, and how they would begin to apply it in a practical manner. Let’s pull back the cover on big data and discover some of its secrets.
First off, what is different about big data? We have had statistical analysis of data for a long time, so that can’t be it. We also have had cool visualizations of data for a long time, so that can’t be it either. So, what is different about big data? And is it just a small side project for your IT shop, or a sea change for your IT future?
Simply stated, what is new and different about big data is the ability of the emerging cloud environments to process unlimited amounts of data. Yes, Virginia, the cloud changes everything – including your data. So, let’s now turn the question around: What do you want big data to mean? If you want it to mean the secure centralization of your data, better control of your data as an asset, and flexible sharing of your data across applications, then you are getting close to unlocking the real potential of big data and to understanding the gap between the current offerings and such a vision.
Facebook and Google, hooks and anchors
Let’s move from promise to reality and examine what two of the biggest big data companies are doing in this regard.
Facebook, which handles the data and metadata of more than 900 million users, is driven by a standard data model called the social graph and its extension to the Web via an open graph. All of Facebook’s features hook into these standardized models. Google, which indexes more than 1 trillion Web pages, has shifted its search away from keywords and toward a knowledge graph. Google introduced the knowledge graph to focus on “things, not strings.” As Google ties more of its products together, you will see this trend continue: a standard data model serves as the anchor, and “hooks” are attached to plug in the supporting applications around it.
And now, I hope you are seeing the secrets of big data coming to light: anchors and hooks first, then visualizations on top. Together, these three elements are the key to leveraging big data for the enterprise.
This sequence of design elements (anchors, then hooks, then visualization) cannot be ignored if you are trying to build a strategy for the long haul (which is especially relevant to government organizations). Anchors are the governing, usage-based metamodels and data models that represent the core data representation of your organization’s mission focus; for example, the Environmental Protection Agency guards the environment while the Transportation Security Administration screens passengers at airports. These missions can have a cohesive, governing set of inter-related models that represent that reality and how your mission affects or manages that reality. That model must NOT be application specific.
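For readers who think in code, here is a minimal sketch of that idea in Python. Everything in it is hypothetical and chosen only for illustration: the `Facility` model, its fields, the `FacilityRegistry` class and the two small "applications" are assumptions, not any agency's actual data model. The point it tries to show is that the anchor is defined once, independent of any application, and that every application reaches it through the same hooks.

```python
from dataclasses import dataclass, field


# Anchor: a shared, application-independent model of the mission's core data.
# "Facility" and its fields are hypothetical, chosen only for illustration.
@dataclass(frozen=True)
class Facility:
    facility_id: str
    name: str
    emissions_tons: float


@dataclass
class FacilityRegistry:
    """One governed copy of the anchor data, shared by every application."""
    _facilities: dict = field(default_factory=dict)

    # Hooks: the only ways applications read or change the anchor data.
    def register(self, facility: Facility) -> None:
        self._facilities[facility.facility_id] = facility

    def all_facilities(self) -> list:
        return list(self._facilities.values())


# Two separate "applications" hook into the same anchor instead of
# keeping their own private copies of the data.
def compliance_report(registry: FacilityRegistry, limit_tons: float) -> list:
    return [f.name for f in registry.all_facilities() if f.emissions_tons > limit_tons]


def dashboard_summary(registry: FacilityRegistry) -> dict:
    facilities = registry.all_facilities()
    total = sum(f.emissions_tons for f in facilities)
    return {"facilities": len(facilities), "total_emissions_tons": total}


registry = FacilityRegistry()
registry.register(Facility("F-001", "North Plant", 120.0))
registry.register(Facility("F-002", "South Plant", 45.5))

print(compliance_report(registry, limit_tons=100.0))  # ['North Plant']
print(dashboard_summary(registry))  # {'facilities': 2, 'total_emissions_tons': 165.5}
```

Note that neither function defines its own notion of a facility. If a third application comes along, it plugs into the same hooks and the model does not change.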
Working through the design of models around all your core organizational user scenarios is both feasible and critical to a lasting, robust information-driven IT architecture. If you have not done so already, the great cloud migration is your opportunity to do it right. Hooks are the interfaces and services that manage and manipulate the core anchors. As I have said before in regard to the cloud, if you have begun your service-oriented architecture (SOA), you are ahead of the game. Visualizations, especially dashboards, present the models to business users along strict usage-based scenarios. I will be speaking about all this at the Big Data Conference in Washington, D.C., on Sept. 18.