Improved image analysis tools speed exploited children cases
- By William Jackson
- Aug 27, 2014
Digital devices have provided law enforcement agencies investigating child abuse and exploitation with an embarrassment of riches. The devices can hold thousands of images that can be used as evidence and as clues to help identify and find missing children. But the sheer volume of data being reviewed can slow an investigation to a crawl.
Adding to the frustration, many of the images under review are already known to investigators; they provide little new information and delay discovery of new evidence.
“Everybody is looking at the same things over and over again,” said Rich Brown, law enforcement liaison and technology advancement officer at the International Center for Missing and Exploited Children (ICMEC). “We felt we could really make a difference in the amount of time it takes an agency to go through millions of pieces of child pornography.”
ICMEC developed Project Vic to promote cooperation and sharing among agencies through development of open standards technology. It hosts a database of digital hashes for several million images of child porn, using a variety of hashing algorithms, including MD5, SHA1 and Microsoft’s PhotoDNA. By helping investigators quickly identify known images, it lets them focus resources on unknown images that can lead them to new victims.
This is part of a move away from “seize and prosecute” tactics focusing primarily on prosecution for possession of child pornography and toward a focus on helping exploited children. “What we’re trying to do is go to a more victim-centric approach,” Brown said.
By aggregating data and using technology to evaluate it, Project Vic aims to avoid duplication of effort and stretch tight law enforcement resources. About 50 departments now participate in the project. Brown said the goal is to have all of the nation’s 62 Internet Crimes Against Children Task Forces participating within the next year.
The Project Vic image hash set is a good investigative reference, but hashing and comparing images is still time-consuming. So Basis Technology, which maintains the Autopsy open source digital forensics platform, developed a new advanced image analysis and categorization module to analyze and prioritize large sets of images more quickly. It uses algorithms to create a hash, or message digest, of each file being examined, which can then be compared with the Project Vic collection to find matches.
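The hash-and-compare step described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the Autopsy module or the Project Vic data format: the function names `file_digests` and `split_known_unknown` are hypothetical, and the reference set is assumed to be a plain set of hexadecimal digest strings. Only MD5 and SHA-1 are shown, because both are available in Python's standard library; PhotoDNA is a proprietary Microsoft technology and is not reproduced here.

```python
import hashlib


def file_digests(path, chunk_size=1 << 20):
    """Return the MD5 and SHA-1 hex digests of a file, reading it in chunks."""
    md5 = hashlib.md5()
    sha1 = hashlib.sha1()
    with open(path, "rb") as handle:
        # Chunked reads keep memory use flat even for very large files.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            md5.update(chunk)
            sha1.update(chunk)
    return {"md5": md5.hexdigest(), "sha1": sha1.hexdigest()}


def split_known_unknown(paths, known_hashes):
    """Partition files into those found in the reference set and the rest.

    known_hashes is assumed to be a set of lowercase hex digest strings;
    the real Project Vic format may differ.
    """
    known, unknown = [], []
    for path in paths:
        digests = file_digests(path)
        if digests["md5"] in known_hashes or digests["sha1"] in known_hashes:
            known.append(path)
        else:
            unknown.append(path)
    return known, unknown
```

A cryptographic digest such as MD5 or SHA-1 matches only byte-for-byte identical files, so a resized or re-encoded copy of a known image would land in the unknown list under this sketch.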
“There certainly are more powerful tools out there,” said Brian Carrier, VP of digital forensics at Basis. But the Autopsy analyzer’s ability to abstract data and prioritize images quickly can help investigators cut through the 85 percent of old material in a cache and focus on the 15 percent of new evidence.
The Autopsy platform supports a variety of open source modules for digital forensics investigations. The image analysis module was developed last year under a contract with the Homeland Security Department’s Science and Technology Directorate to create tools for law enforcement. Agencies want open source tools because they are easy to use and cheap, Carrier said. The goal of the DHS program was to develop new capabilities rather than merely develop free versions of existing products.
“We reached out to a lot of law enforcement people to find out what their greatest pain points were,” Carrier said. “One of their common requirements was for dealing with the large amounts of images.”
Basis spent about five months developing the new Autopsy interface, which was released for beta testing this summer. An early version of it was released at the Crimes Against Children Conference in Dallas on Aug. 11, and the finished version will be included in a new release of Autopsy.
The strength of Autopsy’s image analysis is its ability to return significant results early, while the analysis is going on, rather than dump all of the results together at the end of the process.
“There are some challenges with the idea of providing results as soon as possible,” Carrier said.
The analyzer prioritizes material to be examined on a device, looking first at user content before other files. It also looks at metadata, using Exif (Exchangeable image file format) standard data that is applied to images by digital cameras, scanners and other image-handling technology. By categorizing data on the device, the analyzer selects files most likely to provide new information. When a suspect file is identified that does not match a hash in the Project Vic set, that information is displayed immediately so investigators do not have to wait for the rest of the job to be completed to follow it up.
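The two ideas in that description, examining likely user content first and reporting unmatched files as soon as they turn up, can be illustrated with a priority queue feeding a Python generator. This is a rough sketch under assumptions, not how Autopsy is implemented: the names `priority` and `triage`, the extension list and the directory hints are all hypothetical, every unmatched file is treated as worth reporting, and the Exif metadata step is left out because it would need a third-party library.

```python
import hashlib
import heapq
from pathlib import PurePath

# Hypothetical heuristics for illustration only.
IMAGE_EXTENSIONS = {".jpg", ".jpeg", ".png", ".gif", ".bmp"}
USER_DIR_HINTS = ("users", "home", "documents", "pictures", "downloads")


def priority(path):
    """Lower scores are examined first: user-area images, other images, the rest."""
    p = PurePath(path)
    parts = {part.lower() for part in p.parts}
    is_image = p.suffix.lower() in IMAGE_EXTENSIONS
    in_user_area = any(hint in parts for hint in USER_DIR_HINTS)
    if is_image and in_user_area:
        return 0
    if is_image:
        return 1
    return 2


def triage(files, known_hashes):
    """Yield (path, digest) for each unmatched file as soon as it is found.

    files is an iterable of (path, content_bytes) pairs; known_hashes is
    assumed to be a set of MD5 hex digests.
    """
    # The index breaks ties so the heap never has to compare paths or bytes.
    queue = [(priority(path), index, path, data)
             for index, (path, data) in enumerate(files)]
    heapq.heapify(queue)
    while queue:
        _, _, path, data = heapq.heappop(queue)
        digest = hashlib.md5(data).hexdigest()
        if digest not in known_hashes:
            yield path, digest  # the caller can act on this before the scan ends
```

Because `triage` is a generator, a caller looping over it receives the first unmatched file after a single hash computation, while the lower-priority files are still waiting in the queue.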
The Autopsy analyzer interface is just one tool in the Project Vic toolkit that can be used to assess evidence and further an investigation.
“It reduces the workload of the investigators and lets them focus on new and unknown images,” Brown said.