Connecting classified and unclassified big data


Some of the data used to fight terrorism is classified, but much of it is not. That makes it difficult to cross-reference and share information while still enforcing the appropriate level of security.

Dig IT Award Finalists

The GCN Dig IT Awards celebrate discovery and innovation in government IT.

There are 36 finalists this year. Each will be profiled in the coming days, and the winners for each category will be announced at the Oct. 13 Dig IT Awards gala.

To address that problem, the Department of Homeland Security created the DHS Data Framework, which consists of two Hadoop data lakes (or data management platforms) that can handle large volumes of information. It also uses attribute-based access controls so that only designated users can see the data, which helps protect privacy, civil rights and civil liberties.
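As a rough illustration of the general idea behind attribute-based access control, the sketch below grants access to a record only when the user holds every attribute the record requires. It is not the DHS implementation; the class names, function names and attribute strings are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class User:
    name: str
    attributes: frozenset  # hypothetical labels, e.g. "mission:counterterrorism"


@dataclass(frozen=True)
class Record:
    record_id: str
    required: frozenset  # attributes a user must hold to view this record


def can_view(user: User, record: Record) -> bool:
    """Allow access only if the user holds every required attribute."""
    return record.required <= user.attributes


def visible_records(user: User, records: list) -> list:
    """Return the subset of records this user is allowed to see."""
    return [r for r in records if can_view(user, r)]
```

In this sketch the decision depends on attributes attached to the user and to the data rather than on a fixed list of named individuals, which is the general property that lets one policy cover many data sets.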

“There are a number of different problems that we’re looking to solve with the data framework,” said Paul Reynolds, director of the DHS Data Framework. “Many of them can’t be solved unless you bring the data into one location.”

Law enforcement officials who are investigating a terrorism suspect, for instance, need to look at both classified and unclassified data. Before the data framework, there wasn’t an efficient way to do that, especially not in real time, Reynolds said.

The system takes the unclassified data and moves it up to the classified networks, “so the data itself is still unclassified, but it’s sitting in a classified spot,” he said.

The classified and unclassified data sit in two separate Hadoop data lakes that use a cross-domain guard to share data in near-real time. When the framework is fully operational, DHS officials expect to have 20 to 25 databases in the lakes. Right now, four are fully operational and nine are being populated.
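The one-way flow Reynolds describes, in which unclassified records are copied up to the classified side and keep their original label, can be sketched in a few lines. This is only a toy model of the concept under stated assumptions: the record layout, the "label" field and the function name are hypothetical, and a real cross-domain guard does far more validation than this.

```python
UNCLASSIFIED = "UNCLASSIFIED"  # hypothetical label value


def transfer_up(low_side_records: list, high_side_lake: list) -> list:
    """Copy unclassified records into the high-side store.

    Records keep their original label, so they remain unclassified
    data even though they now sit on the classified side. Anything
    not labeled unclassified is refused and returned to the caller.
    """
    rejected = []
    for record in low_side_records:
        if record.get("label") == UNCLASSIFIED:
            # Copy rather than move: the low-side lake keeps its data.
            high_side_lake.append(dict(record))
        else:
            rejected.append(record)
    return rejected
```

Nothing in this sketch ever moves data from the high side to the low side, which mirrors the direction of flow described in the article.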

And they aren’t small databases. Reynolds said one of them has about 70 billion records in it.

The framework is currently being used only for counterterrorism purposes, but Reynolds said he expects that it will ultimately be used for additional mission areas.

About the Author

Matt Leonard is a former reporter for GCN.
