It's called 'sparse data,' and it could be a big deal

You might start hearing a lot in the near future about sparse data and its impact on network infrastructure. If well-managed and planned for, sparse data can make an entire organization more efficient. But if left to grow unchecked, it could easily overwhelm government servers with a flood of information, resulting in a kind of death by a thousand small cuts.

Sparse data is a term used to describe information coming from sensors or other non-IT devices. It’s also sometimes called thin data, though sparse is probably the better term.

This isn’t someone doing a database query or getting real information from a server, like a report or budget numbers. It’s a sensor recording the temperature and humidity levels, or how often something is used. When the sensor reports that data, it’s really just a blip of information within the overall structure, hence the name. Sparse data almost always goes one way, from the sensor to the network. Although it’s just a bit of data now, there may be many more of these devices in the future, and in unusual places.

Jerry Gentry, vice president, IT program management at Nemertes Research, sees a future where everything from the coffee maker to your office chair is implanted with sensors.

In his vision, devices such as doors report how often they are used, chairs report when they are moved and the coffee machine reports how many cups it produces each day. That may seem like a bunch of junk data, or perhaps a world of Big Brother gone mad, but if you compile all that data in a reasonable way, it can tell you things about your office, such as which routes will be taken in an emergency, or how much harder your HVAC has to work if more desks are placed in a certain area.
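To make "compiling" a little more concrete, here is a minimal Python sketch of rolling one-way sensor blips into per-device daily counts. Everything in it is an assumption made for illustration: the device names, the timestamps, and the helper functions `daily_counts` and `busiest_device` are hypothetical and do not come from Gentry or any real product.

```python
from collections import Counter
from datetime import datetime

# Hypothetical one-way sensor reports: (device_id, ISO timestamp).
# Device names and times are invented for illustration only.
EVENTS = [
    ("door-lobby", "2012-09-17T08:01:00"),
    ("door-lobby", "2012-09-17T08:03:30"),
    ("coffee-2f", "2012-09-17T08:05:00"),
    ("door-lobby", "2012-09-18T09:15:00"),
    ("coffee-2f", "2012-09-18T09:20:00"),
    ("coffee-2f", "2012-09-18T10:45:00"),
    ("door-lobby", "2012-09-18T17:30:00"),
]


def daily_counts(events):
    """Count how many reports each device sent on each calendar day."""
    counts = Counter()
    for device_id, stamp in events:
        day = datetime.fromisoformat(stamp).date().isoformat()
        counts[(device_id, day)] += 1
    return counts


def busiest_device(events):
    """Return the device that reported most often across all days."""
    totals = Counter(device_id for device_id, _ in events)
    return totals.most_common(1)[0][0]


if __name__ == "__main__":
    for (device_id, day), n in sorted(daily_counts(EVENTS).items()):
        print(f"{day} {device_id}: {n} reports")
    print("Busiest device:", busiest_device(EVENTS))
```

Each individual report says almost nothing. The summary, such as which door sees the most traffic on a given day, is where the value might lie.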

Government agencies, like other operations, could use this kind of data to manage buildings more efficiently. Compiled sparse data is already being used in other ways, such as monitoring traffic on bridges and roadways, or in a variety of weather monitors and tsunami prediction systems. Agencies are deploying more and more sensors, which means sparse data likely will become a term you'll hear more often.

In an even more futuristic outlook, Gentry envisions implanting sensors into seeds, which would then grow with the plant and signal farmers when the crop is in distress or ready to be harvested. I guess we would eventually eat the sensors.

But we are a long way from that right now. Still, some sparse data is streaming into data centers today, and it would be in our best interest to figure out how to administer it adequately while it’s still manageable. One or two bee stings can be annoying. Ten thousand will kill you. And we can already hear the buzzing coming our way.

About the Author

John Breeden II is a freelance technology writer for GCN.

Reader Comments

Wed, Sep 19, 2012

This definition of sparse seems unusual to me in the statistics field, or at least not specific enough. Typically sparse data means there are many gaps or empty cells/slots in the data being recorded. In the case of sensors, sending a signal only when the state changes (e.g. a door is opened) is sparse because you get new data intermittently. If you record wind speed, the values change constantly and you get a dense dataset (i.e. new values every second). So not all sensor data is sparse. Or has "sparse" jumped out of its mathematical definition and taken on a new meaning in the IT world?
