Scott Schumacher

ANOTHER VIEW—Guest commentary

Entity resolution's growing role in security efforts

FBI among agencies using the technology for search and analysis across data sets

Identifying and tracking down persons of interest is one of our greatest national challenges.

Editor's Note

This is the first of a two-part commentary. Part two will be published Oct. 7.

As both the Oklahoma City bombing and the Sept. 11, 2001, terrorist attacks demonstrated, national security threats can come from anywhere. The problem has grown more complex now that at least one naturalized U.S. citizen has been directly linked to terrorist activity overseas.

How can U.S. agencies successfully guard against current and emerging threats?

The Homeland Security, Defense and Justice departments, law enforcement agencies and the intelligence community are under increasing pressure to share information in order to more effectively identify and unravel threats before they happen. Yet, how can agencies reconcile the massive amount of related data that exists in bits and pieces across hundreds or thousands of disparate databases? How can agencies “connect the dots” between U.S. citizens and international terrorists with sufficient confidence to identify a potential threat?

Data-mining technologies have been used in the past to try to solve this challenge. However, these technologies require data to be aggregated before it can be mined. They also require that data be complete and correct, which is generally not the case with threat-related information: it is often deliberately falsified or left incomplete to avoid detection.

As a result, data mining alone is not sufficient to satisfy mission requirements.

The intelligence and law enforcement communities have recently begun using a type of technology designed to help connect the dots across data sets – relating to persons of interest, places of residence, vehicles, weapons and so forth. Called entity resolution, the technology enables the integration of records from many different sources and resolves these records into complete “entities” with a high degree of confidence. Once entities have been resolved, the associations within and among the entities can be identified and acted on.
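To make the idea concrete, here is a minimal sketch of resolving records from different sources into entities. It is not how any particular product works: the records, field names, weights and the 0.7 threshold are all invented for illustration, and fuzzy string similarity stands in for whatever matching a real system would use.

```python
from difflib import SequenceMatcher
from itertools import combinations

# Hypothetical records from different source systems (all values invented).
RECORDS = [
    {"id": "arrest-1", "name": "Jonathan Smith", "dob": "1975-03-02", "address": "12 Elm St"},
    {"id": "dmv-7", "name": "Jon Smith", "dob": "1975-03-02", "address": "12 Elm Street"},
    {"id": "watch-3", "name": "Maria Lopez", "dob": "1982-11-19", "address": "4 Oak Ave"},
    {"id": "arrest-9", "name": "Jonathon Smyth", "dob": "1975-03-02", "address": "98 Pine Rd"},
]


def similarity(a, b):
    """Case-insensitive string similarity between 0.0 and 1.0."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def match_score(r1, r2):
    """Weighted confidence that two records describe the same person.

    The weights are assumptions chosen for this example only.
    """
    name = similarity(r1["name"], r2["name"])
    dob = 1.0 if r1["dob"] == r2["dob"] else 0.0
    address = similarity(r1["address"], r2["address"])
    return 0.5 * name + 0.3 * dob + 0.2 * address


def resolve(records, threshold=0.7):
    """Group records into entities by merging every pair scoring >= threshold."""
    parent = {r["id"]: r["id"] for r in records}

    def find(x):
        # Follow parent links to the group's representative record.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for r1, r2 in combinations(records, 2):
        if match_score(r1, r2) >= threshold:
            parent[find(r1["id"])] = find(r2["id"])

    entities = {}
    for r in records:
        entities.setdefault(find(r["id"]), []).append(r["id"])
    return sorted(sorted(ids) for ids in entities.values())


if __name__ == "__main__":
    for entity in resolve(RECORDS):
        print(entity)
```

In this toy data, the differently spelled name with a different address still lands in the same entity as the other two Smith records, because the birth date and most of the name agree. Comparing every pair of records does not scale to hundreds of millions of records; the sketch only shows the matching idea.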

The end result: Technology can be heavily leveraged to attack the signal-to-noise ratio problem faced by analysts and investigators and help them more easily connect the dots. Resolving entities and associated relationships with a high degree of confidence enables analysts and investigators to focus their efforts on the persons of interest and relationships that truly matter.

To the fore

Research firm and consultancy Gartner has been tracking the entity-resolution market for several years. “Entity resolution and analysis was previously an obscure technology that has come to the forefront as a result of world events and market forces where it is used to identify the use of false identities and networks of individuals who are attempting to hide their relationships to each other,” stated Gartner in “Hype Cycle for Master Data Management,” a report released in June.

In other words, when persons of interest are leaving trails of misleading information, agencies must be able to recognize conflicting data and act accordingly. That’s precisely what entity resolution does.

Today, most data exists within discrete stovepipes, associated with a particular application, system or architecture. For example, the details within the local arrest record of a U.S. citizen, including information associated with that person and the circumstances surrounding the arrest, are generally viewed and analyzed only within the context of the local law enforcement records management system. Similarly, details associated with suspects on a national watch list are generally viewed within the context of that application, run by a different agency. The various levels of law enforcement – from local and county-level agencies to state sheriffs and federal intelligence agencies – have not traditionally shared data because of both political and technology roadblocks.

Entity resolution helps agencies transition from a system-centric to an entity-centric enterprise by decoupling entity data from contributing source systems. Once decoupled, relationships within and among the entities can then be identified and managed.

The resolved data set can then be made available to all users with appropriate access rights, through the standard applications already in place. Critically important for security and privacy, the data from the source systems can remain in place and under the control of the contributing application and/or organization.

When coupled with entity extraction technologies, entity data from unstructured data sets can also be integrated and can help connect other pieces of the puzzle as well.

By identifying and managing relationships between persons of interest and other individuals or objects, entity resolution delivers a more comprehensive view of people, places or things and their activity. Because it significantly mitigates the signal-to-noise challenge, analysts can be much more proactive in identifying the hot spots or patterns that would help thwart an attack – or contribute to solving or preventing a crime.

Marks of success

Implementations of entity-resolution technology have been widely successful. One of the largest implementations to date is the FBI’s National Data Exchange (N-DEx) program, a criminal justice information sharing system that will provide nationwide connectivity to disparate local, state, tribal and federal systems for the exchange of information.

N-DEx will provide law enforcement agencies with a powerful new investigative tool to search, link, analyze and share information (for example, incident and case reports) on a national basis to a degree never before possible. N-DEx will primarily benefit local law enforcement agencies in their role as the first line of defense against crime and terrorism.

The FBI system leverages entity resolution to associate many different types of data across the hundreds of millions of records contributed by the participating agencies and resolves this disparate data into entities. Once resolved, the relationships within and among the entities are then identified and acted on. This is initially done within the context of incident reports and investigations but also enables identification of potential relationships across these.
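As a rough illustration of identifying relationships among resolved entities, the sketch below links entities that appear in the same incident report and then looks up an entity's associates across reports. It does not describe how N-DEx is built: the incident reports, entity IDs and function names are all hypothetical.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical result of a resolution step: source record ID -> entity ID.
RECORD_TO_ENTITY = {
    "arrest-1": "person-A",
    "dmv-7": "person-A",
    "watch-3": "person-B",
    "tip-4": "person-C",
    "plate-2": "vehicle-X",
}

# Hypothetical incident reports, each listing the source records it mentions.
INCIDENTS = {
    "incident-100": ["arrest-1", "plate-2"],
    "incident-200": ["dmv-7", "watch-3"],
    "incident-300": ["tip-4", "plate-2"],
}


def link_entities(incidents, record_to_entity):
    """Map each pair of entities to the incident reports in which both appear."""
    links = defaultdict(set)
    for incident_id, record_ids in incidents.items():
        entities = sorted({record_to_entity[r] for r in record_ids})
        for a, b in combinations(entities, 2):
            links[(a, b)].add(incident_id)
    return dict(links)


def associates(links, entity):
    """Return every entity that shares at least one incident with `entity`."""
    found = set()
    for a, b in links:
        if a == entity:
            found.add(b)
        elif b == entity:
            found.add(a)
    return sorted(found)


if __name__ == "__main__":
    links = link_entities(INCIDENTS, RECORD_TO_ENTITY)
    print(associates(links, "person-A"))
    print(associates(links, "vehicle-X"))
```

Because two different source records resolve to the same person, that person is linked to both a vehicle and another person, even though no single report mentions all three. The vehicle, in turn, ties together two people who never appear in the same report.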

This type of information sharing is available to participating law enforcement agencies of all levels, from local to federal.

It is important to note that entity-resolution technology works with the information already in place – information already gathered by local, state and federal authorities. There are nearly 18,000 state and local jurisdictions in the United States. Each of these jurisdictions has evolved its information systems according to its own needs and concerns; data models and formats vary widely.

Entity resolution provides a means for participating agencies to retain the management and governance of their data while contributing to the benefit of all as a collective information sharing exchange.

Entity resolution enables contributing agencies to connect the dots within information, across systems and across evidence within investigations. With this information, agencies can then identify terrorist and criminal activity that was previously undetectable and act decisively to prevent or investigate incidents.

Local, state and federal authorities already maintain a wealth of valuable information – particularly regarding national security and threats against it. The goal is to enable agencies to share information while protecting the privacy and security of that data, and to help agencies “connect the dots” so they can effectively and efficiently identify and track down persons of interest.

The ultimate goal is to continue to ensure the safety of our citizens and country. Entity resolution is a fundamental piece of securing our safety.

Reader Comments

Fri, Nov 13, 2009 Graham Charters

As a Senior Consultant at a Data Governance specialist, I found this article fascinating. At Evaxyx, we believe that information is at the heart of any modern enterprise, and that it must be used for business advantage. We always begin by constructing a model of the data used in an enterprise. Our models promote engagement over formality. Before any discussions on data can begin, it is essential that a common basis of understanding is achieved. There are always existing perspectives to accommodate. We do this by working collaboratively and intensely with our customers.

Sun, Oct 18, 2009 Nigel DeFreitas Jersey City, NJ

Great article, Scott! I've looked at a few of these systems, but it seems as though IBM's Entity Analytics Solution product (formerly SRD Systems, funded by In-Q-Tel, developed in large part by Jeff Jonas for casinos) is by far the best of class in this space. A conversation with a Gartner analyst verified this. The problem I see is that most vendors that claim to be in this space are using glorified name-to-key-based searching systems. We have SSA/Trillium/Name Search 3, and can do this ourselves. In your opinion, what sets Initiate Systems apart?
