What's holding back Hadoop?

Hadoop -- the open-source, distributed programming framework that relies on parallel processing to store and analyze both structured and unstructured data -- has been the talk of big data for several years now. And while a recent survey of IT, business intelligence and data warehousing leaders found that 60 percent will have Hadoop in production by 2016, deployment remains a daunting task.

TDWI -- which, like GCN, is owned by 1105 Media -- polled data management professionals in both the public and private sectors, who reported that staff expertise and the lack of a clear business case topped their list of barriers to implementation:

Barriers to implementation (respondents who checked each category)
Inadequate skills or difficulty of finding skilled staff: 42%
Lack of compelling business case: 31%
Lack of business sponsorship: 29%
Lack of data governance: 29%
Security for Hadoop data: 29%
Lack of metadata management: 28%
Excessive hand coding required of Hadoop: 27%
Cost of staffing Hadoop admin/development: 25%
Cost of implementing a new technology: 22%
Difficulty of architecting big data analytic system: 22%
Immature support for ANSI-standard SQL: 19%
Interoperability with existing systems or tools: 19%
Software tools are few and immature: 19%
Enterprise-class manageability: 17%
Not enough information on how to get started: 16%
Slow pace of hand-coded development: 16%
Cannot make big data usable for end users: 13%
Handling data in real time: 13%
Existing user-defined DW architecture: 12%
Poor quality of Hadoop data: 11%
Software tools need higher-level language support: 10%
Hadoop's high operational expenses: 9%
Enterprise-class availability: 9%
Other: 2%

The respondents did, however, see a wide range of uses to justify the deployment efforts, including:

HDFS applications (respondents who checked each category)
Complementary extension of a data warehouse: 46%
Data exploration and discovery: 46%
Data staging for data warehousing and data integration: 39%
Data lake: 36%
Queryable archive for non-traditional data: 36%
Computational platform and sandbox for analytics: 33%
Enterprise data hub (for both new and traditional data): 28%
Business intelligence (reporting, dashboards): 27%
Queryable archive for traditional enterprise data: 19%
Operational data store (ODS): 17%
Repository for content, records management: 17%
Operational application support (apps on Hadoop data): 11%
Don't know: 3%
Other: 1%

And just 6 percent said Hadoop deployments were not in their organization's plans at all:

[Chart: When do you expect to have HDFS in production?]

The full report, which also includes best practices and implementation trends, is available here.

About the Authors

Troy K. Schneider is editor-in-chief of FCW and GCN.

Prior to joining 1105 Media in 2012, Schneider was the New America Foundation’s Director of Media & Technology, and before that was Managing Director for Electronic Publishing at the Atlantic Media Company. The founding editor of NationalJournal.com, Schneider also helped launch the political site PoliticsNow.com in the mid-1990s, and worked on the earliest online efforts of the Los Angeles Times and Newsday. He began his career in print journalism, and has written for a wide range of publications, including The New York Times, WashingtonPost.com, Slate, Politico, National Journal, Governing, and many of the other titles listed above.

Schneider is a graduate of Indiana University, where his emphases were journalism, business and religious studies.

Click here for previous articles by Schneider, or connect with him on Twitter: @troyschneider.


Jonathan Lutton is an FCW editorial fellow. Connect with him at jlutton@fcw.com

Reader Comments

Wed, May 20, 2015 JustACommenter

"Inadequate skills or difficulty of finding skilled staff" Who on earth filled out the survey questions. TRAIN the staff! Is that impossible? Most programmers can read documentation and use API's. If you are not willing to train them, then don't blame the employees. If you can't find Hadoop experts..OK Hadoop is relatively NEW...is it insane to assume that you will send people to TRAINING or give them time and resources to learn the technology by reading the documentation?
