How to make big data more useful, reliable – and fast
- By Rutrell Yasin
- Nov 05, 2012
Government IT managers are looking for tools that make it easier to identify meaningful patterns and statistical trends in far-flung data sets. At the same time, these tools must work well with other technologies to help analysts make decisions in real time.
Splunk, a firm that offers tools for collecting and analyzing machine data generated by back-end IT systems, is looking to address these concerns by bringing real-time operational intelligence to big data storage and batch processing.
Enter Splunk Enterprise, software that collects, indexes and harnesses fast-moving machine data generated by organizations’ applications, servers and devices, whether they are physical, virtual or in the cloud. Splunk also helps troubleshoot application problems and investigate security incidents rapidly, helping organizations avoid service degradation or outages.
Splunk’s Hadoop Connect helps integrate and move data easily between Splunk Enterprise and Hadoop, software that has become a mainstay of big data analytics. Splunk lets users send events from Splunk to Hadoop for long-term archival and data science batch analytics. Conversely, data already in Hadoop can be sent to Splunk for analysis without users having to write code.
Additionally, the company unveiled the Splunk App for HadoopOps, an application that provides real-time monitoring and analysis of the health and performance of the entire Hadoop environment, all from one interface. While existing Hadoop monitoring tools focus just on the Hadoop layer, the Splunk App for HadoopOps encompasses all layers of the infrastructure, including Hadoop, the network, switch, rack, operating system and database, Splunk officials said.
So how might this work in a real-world situation in a government agency?
A security administrator who needs real-time visibility across the whole enterprise and every device can use Splunk to immediately detect anomalies because the software collects data as it is being generated, according to Sanjay Mehta, Splunk’s vice president of product marketing.
Federal users are looking forward to capabilities that let them solve problems faster through the integration of Splunk and Hadoop, said Bill Cull, vice president of Splunk’s public sector, noting that many of Splunk’s federal users are in the intelligence community.
Both Splunk Hadoop Connect and Splunk App for HadoopOps are available for free to users of Splunk Enterprise via the company’s app store. Splunk runs on all major platforms, including Linux, Unix and Microsoft Windows, Mehta said. The software is built on a distributed architecture, which allows users to share workloads across machines via parallel processing. It runs in virtualized and cloud environments and can even manage multi-tenant cloud environments, he said.
"Splunk has taken a methodological approach to defining its co-existence with Hadoop," said Matt Aslett, research manager for data management and analytics with 451 Research. Splunk Hadoop Connect not only integrates with Hadoop but also interacts with it, while Splunk App for HadoopOps monitors cluster resources beyond Hadoop itself. This offers users a single platform for managing and analyzing data in both environments, he said.
Rutrell Yasin is a freelance technology writer for GCN.