New tools help bring Hadoop into the enterprise
As government agencies tap into the power of Apache Hadoop to analyze large volumes of data, developers are jumping on the big data bandwagon with technology that integrates the open-source programming framework with enterprise platforms and manages Hadoop-based infrastructures more efficiently.
Several companies have already released technology to help agencies and businesses share and distribute big data analytics across different enterprise platforms and cloud infrastructures. Attunity's RepliWeb lets data analysts quickly replicate data files to, from and between Hadoop platforms. Splunk, in turn, has an application that lets users send events from its enterprise software to Hadoop for long-term archival and data-science batch analytics.
Moreover, the growing interest in Hadoop will ensure that the open-source framework is more tightly integrated with analytics tools. By 2015, 65 percent of packaged applications with advanced analytics will come embedded with Hadoop, according to a report by market researcher Gartner.
A handful of companies have recently joined the fray, unveiling technology to further simplify Hadoop management and performance while improving the framework’s security.
DataDirect Networks' pre-configured hScaler appliance is designed to take the complexity out of setting up Hadoop-based analytics. It includes an integrated extract, transform and load engine, which combines three database functions into one tool, and a management interface to simplify monitoring of an entire Hadoop infrastructure. It can also offload data management functions to highly available storage, offering better performance during server failures, DDN officials said.
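For readers unfamiliar with the term, the three functions that an extract, transform and load engine combines can be sketched in a few lines of Python. This is a generic illustration only, not DDN's implementation: the sample data, the function names and the in-memory list standing in for a data store are all assumptions made for the example.

```python
import csv
import io

# Hypothetical raw input; a real pipeline would read from files or a database.
RAW_CSV = """agency,records
Census, 1200
NOAA,not_available
NASA,3400
"""

def extract(text):
    """Extract: parse raw CSV text into a list of dictionaries."""
    return list(csv.DictReader(io.StringIO(text)))

def transform(rows):
    """Transform: normalize fields and drop rows whose count is not a number."""
    cleaned = []
    for row in rows:
        count = row["records"].strip()
        if count.isdigit():
            cleaned.append({"agency": row["agency"].strip().upper(),
                            "records": int(count)})
    return cleaned

def load(rows, target):
    """Load: append cleaned rows to the target (a list standing in for a data store)."""
    target.extend(rows)
    return len(rows)

warehouse = []
loaded = load(transform(extract(RAW_CSV)), warehouse)
```

Chaining the three steps in one call mirrors the idea of a single tool handling all of them; the row with a non-numeric count is dropped during the transform step.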
Intel's Distribution for Apache Hadoop provides silicon-based encryption so agencies and businesses can securely analyze their data sets without compromising performance, Intel officials said. It includes Intel Manager for Apache Hadoop software, which helps simplify the deployment, configuration and monitoring of Hadoop cluster servers.
Red Hat plans to contribute its Red Hat Storage Hadoop plug-in to the Apache Hadoop open community, with the aim of transforming Red Hat Storage into a fully supported, Hadoop-compatible file system for big data environments, company officials announced.
Revelytix's Loom Dataset Management for Hadoop is designed to improve the management of big data files created by Hadoop through dataset tracking, auditing and capabilities such as data lineage, which describes where datasets originate and which are suitable for certain tasks, according to company officials.
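The idea behind data lineage can be illustrated with a short Python sketch. It is not Loom's API or data model: the dataset names, the `LINEAGE` mapping and the `origins` function are hypothetical, and the sketch assumes the lineage records contain no cycles.

```python
# Hypothetical lineage records: each dataset maps to the datasets it was derived from.
LINEAGE = {
    "raw_logs": [],
    "raw_accounts": [],
    "sessions": ["raw_logs"],
    "usage_report": ["sessions", "raw_accounts"],
}

def origins(dataset, lineage):
    """Return the source datasets (those with no parents) that a dataset traces back to."""
    parents = lineage.get(dataset, [])
    if not parents:
        # A dataset with no recorded parents is treated as an original source.
        return {dataset}
    found = set()
    for parent in parents:
        found |= origins(parent, lineage)
    return found

report_sources = origins("usage_report", LINEAGE)
```

Here `report_sources` holds the two raw datasets the report ultimately depends on, which is the kind of question ("where did this dataset originate?") that lineage tracking is meant to answer.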