COMMENTARY

### Connecting the dots isn't what it used to be

Flood of digital information increases the need for accuracy, including knowing which data to leave out

Remember when we used to ride around in our cars and listen to AM radio? Maybe you’re not quite old enough to remember, but there was a time when AM radio was all we had – and that was fine. There also used to be only a handful of television channels, which we had to get up out of our chairs to change. That was fine, too.

I don’t remember longing for a wider variety of music on the radio, or more channels to watch on TV. We had what we had, and it was all fine – it was all “good enough.”

There was also a time when the level of accuracy that our intelligence and law-enforcement systems offered was “fine.” We connected the dots well enough to eliminate the greatest threats.

Not anymore.

Today, there is an intense push for accuracy in our data and, particularly, in our ability to accurately “connect the dots.” Why now? What’s changed? What’s pushing the accuracy button more than it’s been pushed before?

Turn off the radio, put down the remote, and I’ll explain.

What is accuracy?

I’m a mathematician. When I think of accuracy, I think of numbers and percentages, of false negatives and false positives. But for law enforcement or intelligence officials, accuracy means tracking down and mitigating a potential threat before it materializes.

Both perspectives are critical in understanding what accuracy is and how to improve your results.

Mathematically, accuracy is a pair of numbers: how often you “miss” (present a false negative) and how often you incorrectly “hit” (present a false positive). Together, the two numbers measure how well your process makes a decision, that is, how well it can find the “true-positive” results amid the false negatives and false positives.

When you hear a phrase like, “our system is 95 percent accurate,” it usually refers only to the false-negative rate, meaning the share of real connections the system missed. To gauge the true accuracy of that system, you also need to know the false-positive rate. If the system touts a 95 percent accuracy rate (focusing on the things it missed) but floods you with false positives, that’s not going to get you very far. You’re going to spend all your time chasing false threats.
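To make the “pair of numbers” concrete, here is a minimal Python sketch. The class name `MatchCounts` and every count in it are hypothetical, invented only for illustration; they do not describe any real system.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MatchCounts:
    """Hypothetical tallies from comparing a system's flags with ground truth."""
    true_positives: int   # real threats the system flagged
    false_negatives: int  # real threats the system missed
    false_positives: int  # harmless records the system flagged anyway
    true_negatives: int   # harmless records the system left alone

    def false_negative_rate(self) -> float:
        """Share of real threats that were missed."""
        real = self.true_positives + self.false_negatives
        return self.false_negatives / real if real else 0.0

    def false_positive_rate(self) -> float:
        """Share of harmless records that were flagged."""
        harmless = self.false_positives + self.true_negatives
        return self.false_positives / harmless if harmless else 0.0

    def precision(self) -> float:
        """Share of flagged records that were real threats."""
        flagged = self.true_positives + self.false_positives
        return self.true_positives / flagged if flagged else 0.0


# Made-up example: 100 real threats hidden among 1,000,000 harmless records.
counts = MatchCounts(
    true_positives=95,
    false_negatives=5,
    false_positives=50_000,
    true_negatives=950_000,
)

print(f"false-negative rate: {counts.false_negative_rate():.2%}")
print(f"false-positive rate: {counts.false_positive_rate():.2%}")
print(f"precision:           {counts.precision():.2%}")
```

With these invented counts, the system misses only 5 percent of real threats and wrongly flags only 5 percent of harmless records, yet fewer than 1 in 500 of the records it flags is a real threat. That is the flood of false positives in numerical form.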

From an intelligence perspective, accuracy is just as much about keeping data apart as it is putting it together. If there were a security threat in a particular airport at a particular time, a less-than-accurate system might flag every person who was in the airport at that time as a suspect. A highly accurate system would be able to parse through the vast numbers of individuals in the airport at that time, “connect the dots” between those people and other data points within other records, and present a highly targeted list of suspects.

Accuracy is being able to make the best use of all the information you have – putting data together where necessary, and keeping it apart where necessary, to create a highly targeted list of “true-positive” results.
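The airport example can be sketched in a few lines of Python. This is only a toy illustration under stated assumptions: the function `targeted_list`, the identifiers and the link data are all hypothetical, and a real system would weigh many kinds of evidence rather than a single shared attribute.

```python
from typing import Dict, List, Set


def targeted_list(
    present: Set[str],
    links: Dict[str, Set[str]],
    watch_list: Set[str],
) -> List[str]:
    """Return the people present who share at least one attribute
    (phone number, address, and so on) with a watch-list entry."""
    return sorted(
        person
        for person in present
        if links.get(person, set()) & watch_list
    )


# Hypothetical data: four travelers in the airport at the time of the threat.
present = {"traveler_a", "traveler_b", "traveler_c", "traveler_d"}

# Hypothetical links from each traveler to attributes found in other records.
links = {
    "traveler_a": {"phone_111"},
    "traveler_c": {"phone_999", "address_42"},
}

# Hypothetical watch-list attributes.
watch_list = {"phone_999"}

suspects = targeted_list(present, links, watch_list)
print(suspects)
```

A less accurate approach would return everyone in `present`. The sketch instead keeps apart the travelers with no relevant link and returns only the one connected to a watch-list attribute.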

Why now?

The description of accuracy hasn’t changed since the time of AM-only radio, but the need has changed – because our circumstances have changed.

Two primary factors are driving the push for greater accuracy:

• More information. Today’s law enforcement officials have to deal with millions of terabytes more data than ever before. Not only are there more records about more people (a simple function of our digital times), there is also significantly more travel (international travel in particular), along with more people traveling on visas, more types of communication and a wider variety of threats.
• More fragmentation. As the amount of information grows, so do the number of places where it is stored and the number of forms it takes. From local police records to state databases to federal watch lists, and across all the different types of entities (people, phone numbers, weapons) that reside in each, intelligence and law enforcement officials face the daunting task of connecting dots between and among all this information. Their job is akin to finding a needle in a haystack.

Adding to the challenge, the risks of missing important connections, of failing to connect the dots between data that already exists, are greater than before. Being marginally accurate is no longer good enough.

Finding the right technology

Finding the right solution is all about understanding what accuracy is and what you should expect from a highly accurate system. The most effective technology will let you look at the right 10,000 things, not just the top 10,000 things. The right technology will help you reduce the noise and, in turn, improve the signal-to-noise ratio.

The most effective technology also will have a level of intelligence. Technology that produces the most accurate results will be able to account for misspellings and other errors, as well as for purposeful deception, a practice common among persons of interest trying to avoid detection. That technology also will have the ability to recognize and account for cultural anomalies and unique factors in different languages.
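One simple way to tolerate misspellings is approximate string matching. The sketch below uses `difflib.SequenceMatcher` from the Python standard library; the helper names and the 0.85 threshold are assumptions chosen for illustration, not a description of any particular product. It handles accents and punctuation but not transliteration, nicknames or deliberate aliases, which would need far more than this.

```python
import re
import unicodedata
from difflib import SequenceMatcher


def normalize_name(name: str) -> str:
    """Lowercase, strip accents, and collapse punctuation and spacing."""
    decomposed = unicodedata.normalize("NFKD", name)
    without_accents = "".join(
        ch for ch in decomposed if not unicodedata.combining(ch)
    )
    return re.sub(r"[\W_]+", " ", without_accents.lower()).strip()


def name_similarity(a: str, b: str) -> float:
    """Return a similarity score between 0.0 and 1.0 for two names."""
    left, right = normalize_name(a), normalize_name(b)
    if not left or not right:
        return 0.0  # an empty name is never treated as a match
    return SequenceMatcher(None, left, right).ratio()


def likely_same_person(a: str, b: str, threshold: float = 0.85) -> bool:
    """Flag two names as a probable match; the threshold is an arbitrary
    illustrative value that a real system would tune against known data."""
    return name_similarity(a, b) >= threshold


print(likely_same_person("Jon Smith", "John Smith"))
print(likely_same_person("Jon Smith", "Maria Lopez"))
```

Here “Jon Smith” and “John Smith” score roughly 0.95 and are treated as a probable match, while “Jon Smith” and “Maria Lopez” score far below the threshold. Raising the threshold cuts false positives at the cost of more false negatives, which is the trade-off described earlier.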

The technology you use also has to be adaptable. It has to let you introduce new information and new agents, securely exchange information with other reliable sources, and give those sources appropriate access.

Finally, your technology has to be flexible, so you can readjust your priorities according to changing threats, threat levels and resources.

Accuracy involves many factors, and it’s a moving target. Because threats keep evolving, we must keep working at this and keep pushing the envelope.

“Good enough” just doesn’t work anymore. As soon as we can do more, we have to do more. And we have to keep pushing for greater accuracy and a greater ability to connect the dots. Our national security is at stake.

Scott Schumacher, Ph.D., a government security and technology expert, has been involved for more than 20 years in research, development, testing and implementation of complex data analysis solutions, including work commissioned by the Defense Department. He is a member of the Institute of Mathematical Statistics and the American Statistical Association.
