Big data: How to know it when you see it

Information from sensors, video, satellites, genetic codes, social media and web tracking is contributing to a big data soup that web-connected devices are serving up. The vast amount of data now available, not to mention the storage and analytics technology that makes analysis of that data possible, is making “big data” look like the answer to every question.

But what is big data, really?

According to the National Institute of Standards and Technology, big data consists of extensive datasets that require a scalable architecture for efficient storage, manipulation, and analysis. Commonly known as the ‘Vs’ of big data, the characteristics of data that require these new architectures include:

Volume – the size of the dataset at rest, referring to both the size of each data object and the number of data objects. Although the term “big data” doesn’t specify a particular quantity, it is often used in discussing petabytes and exabytes of data.

Velocity – the data in motion, or rate of flow, referring to both the acquisition rate and the update rate of data from real-time sensors, streaming video or financial systems.

Variety – data at rest from multiple repositories, domains or types, from unstructured text or images to highly structured databases.

Variability – the rate of change of the data, such as from applications that generate a surge in the amount of data arriving in a given period.

Veracity – the completeness and accuracy of the data sources, the provenance of the data, its integrity and its governance.
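To make the five characteristics concrete, here is a minimal Python sketch that computes one rough number for each of them over a handful of made-up records. Everything in it is an assumption for illustration: the sample records, the `profile` function and the choice of measures are hypothetical and are not taken from NIST. Veracity in particular is reduced here to a simple completeness ratio, which leaves out provenance, integrity and governance.

```python
from collections import Counter

# Hypothetical sample records: (arrival_second, source_type, payload).
# A payload of None stands in for an incomplete record.
records = [
    (0, "sensor", "21.5"),
    (0, "sensor", "21.7"),
    (1, "video", "frame-0001"),
    (2, "social", "hello world"),
    (2, "sensor", None),
    (2, "sensor", "22.0"),
    (2, "sensor", "22.1"),
    (3, "social", "big data"),
]


def profile(records):
    """Return rough, illustrative measures of the five Vs for a list of records."""
    payloads = [p for _, _, p in records if p is not None]
    per_second = Counter(t for t, _, _ in records)   # arrivals in each second
    span = max(per_second) - min(per_second) + 1     # seconds covered
    mean_rate = len(records) / span
    return {
        # Volume: number of objects and their total size.
        "volume_objects": len(records),
        "volume_bytes": sum(len(p.encode("utf-8")) for p in payloads),
        # Velocity: average arrival rate.
        "velocity_per_s": mean_rate,
        # Variety: number of distinct source types.
        "variety_types": len({s for _, s, _ in records}),
        # Variability: how far the busiest second exceeds the average.
        "variability_peak_to_mean": max(per_second.values()) / mean_rate,
        # Veracity (completeness only): share of records with a payload.
        "veracity_complete": len(payloads) / len(records),
    }


for name, value in profile(records).items():
    print(f"{name}: {value}")
```

On a real system these figures would come from storage and streaming metrics rather than an in-memory list, and at big data scale they would be the kind of numbers that signal when a scalable architecture is needed.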

