The chase for big data skills

The hunt will soon be on for professionals who can act on the fly.

It’s a truism in just about every important IT sector the government is involved in: No matter the sophistication of the technology or the value of the goal it’s supposed to address, if you don’t have the people who know how to use it, you’re sunk. That’s been true for cloud computing, virtualization and cybersecurity. The next area of concern looks like it will be big data.

Right now there’s no problem because in the areas in which big data is most actively pursued in government — mainly at the research and intelligence agencies — there are plenty of people with the expertise to manage big data projects. Look forward a few years, however, when big data know-how will be in demand across the government enterprise, and the situation is not so certain.

It’s likely that the government will have a big problem on its hands. With the demand for big data professionals also growing in the private sector and few places now where the skills are being taught, a major gap in supply is being predicted.

In 2011, McKinsey Global Institute, the research arm of management consultant McKinsey and Co., forecast a possible shortfall of 140,000 to 190,000 people with deep analytical skills, as well as 1.5 million managers and analysts with the know-how to use the analysis of big data to make effective decisions.

On the bright side, said Peter Doolan, group vice president and chief technologist for Oracle Public Sector, agencies might be pleasantly surprised. The code for at least some of the data store and data file technologies under the big data hood is almost the same as what was in use 25 years ago, in the days before relational databases, he said. And that could fit well with a “generation of disenfranchised IT developers who are coming back into the fold.”

However, he said, “there is a requirement for a particular skill set for how you tease the information out of the data flow itself, and the math required for that does need a fairly highbrow skill set.”

Agencies could go a fair distance with the skills they already have, but then they’ll need to make sure those skills advance, he said.

Agencies should take a cue from the experience they’ve had with other IT processes, said Matthew Martin, a solutions architect at Merlin International. Like the cloud, big data is a cross-functional discipline, and agencies will find they already have some of the skill sets they need, though not all.

If they lag anywhere, it will probably be on the application and data management sides of the equation and in the areas where the two converge.

“But I don’t think it will be any different for big data than it will for any other IT area, except that the adoption of open source will help steer the requirements for big data into the development of tools upfront, which will lower the need for any customization,” he said. “And that will cut back on at least some of the skills requirements.”

Nevertheless, he thinks it will be a good idea for agencies to get ahead of the big data trends to help narrow the gap between the available talent pool and the eventual skills demand.

“Agencies will need a lot of cross-fertilization to make sure those [skills] are available when they need them,” he said.

The fear of shortages is apparent even in the research and development sector, where the supply is currently strong.

The National Science Foundation recently said it would establish a new track in its Cyberinfrastructure Framework for 21st Century Science and Engineering program that would aim for the “education and support of a new generation of researchers able to address fundamental big data challenges concerning core techniques and technologies, problems, and cyber infrastructure across disciplines.”

However, warned Dale Wickizer, NetApp Inc.’s chief technology officer for the U.S. Public Sector, it’s not just a case of having the people for big data and big analytics because, in the future, the whole mechanism for getting answers out of data will change.

Typically, you now have questions that need answers and you build a query around them, and then the database is built around the query and the same questions get asked of the database thousands of times a day, he said. But when the behavior changes around the business and the mission, when people don’t have the luxury of knowing week in, week out what the questions will be, that will be a brand-new situation.

“These newer databases will lend themselves more to building things on the fly, and as the questions change, you will need a different schema each time to get the information out of the data,” he said. “That will need a completely different skill set.”
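The shift Wickizer describes is often called schema-on-read: the raw records are stored as they arrive, and a schema is applied only when a question is asked. The Python sketch below is one way to picture it, not a description of any particular product. The sample records, field names and function names are all hypothetical, made up for illustration.

```python
import json
from collections import Counter

# Hypothetical raw records, stored as-is with no fixed table layout.
# Note that the first record has no "state" field at all.
RAW_EVENTS = [
    '{"agency": "NSF", "type": "grant", "amount": 120}',
    '{"agency": "NSF", "type": "grant", "amount": 80, "state": "VA"}',
    '{"agency": "DOE", "type": "contract", "amount": 200, "state": "MD"}',
]


def read_with_schema(raw_lines, schema):
    """Apply a schema (field name -> default value) to raw records at read time."""
    for line in raw_lines:
        record = json.loads(line)
        # Keep only the fields this question needs, filling gaps with defaults.
        yield {field: record.get(field, default) for field, default in schema.items()}


def total_by_agency(raw_lines):
    """First question: how much money is attached to each agency?"""
    totals = Counter()
    for row in read_with_schema(raw_lines, {"agency": "unknown", "amount": 0}):
        totals[row["agency"]] += row["amount"]
    return dict(totals)


def count_by_state(raw_lines):
    """A later question: how many records per state? Needs a different schema."""
    counts = Counter()
    for row in read_with_schema(raw_lines, {"state": "unspecified"}):
        counts[row["state"]] += 1
    return dict(counts)


if __name__ == "__main__":
    print(total_by_agency(RAW_EVENTS))
    print(count_by_state(RAW_EVENTS))
```

The stored data never changes between the two questions. Only the schema passed to read_with_schema does, which is the kind of on-the-fly work the quote suggests will call for a different skill set than maintaining a fixed query against a fixed table.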

About this Report

This report was commissioned by the Content Solutions unit, an independent editorial arm of 1105 Government Information Group. Specific topics are chosen in response to interest from the vendor community; however, sponsors are not guaranteed content contribution or review of content before publication.