Finding child victims in a haystack of forensic images
- By Paul McCloskey
- Oct 23, 2015
According to the Homeland Security Investigations agency, one in five girls and one in 10 boys in the United States will be sexually exploited before they reach adulthood. Protecting victims and catching offenders has been the focus of thousands of federal, state and local law enforcement agents.
In 2011, the HSI Cyber Crimes Center’s Child Exploitation Investigations Unit (CEIU) created the National Child Victim Identification Program to identify and rescue child victims, apprehend offenders and locate crime scenes. The program included a Victim Identification Laboratory where seized images, videos, audio and metadata are analyzed, enhanced and clarified.
Investigation of child pornography trafficking generates more data than many law enforcement agencies can process. In 2014, CEIU seized 5.2 petabytes of data, 52 percent of which involved child sexual exploitation. Unfortunately, much of it is inaccessible because investigators lack standard technologies to share data across the law enforcement community.
To meet the challenge, in 2012 CEIU joined forces with officers from across the law enforcement community, including the International Centre for Missing and Exploited Children (ICMEC), to launch Project Vic and put the latest forensic tools to work sifting through evidence of victims of child pornography.
Richard Brown, technology advancement officer at ICMEC, said the primary goal of Project Vic was to get law enforcement “on the same page when it came to standardizing the way they exchanged data with each other and the services they need to access.”
Tool providers had been using proprietary methods to manage forensic data, he said. In addition, the group became aware that police officers worldwide were duplicating their efforts by examining the same images over and over.
To categorize the data, Project Vic originally relied on binary hash sets (MD5 digests, computed by a hashing algorithm from each file's contents, that act as digital fingerprints) to identify whether seized evidence matched existing library files and datasets. The hashes were maintained in a database police could check to see which images had already been identified. Early versions of the tool were effective in scoring matches, but they tended to keep the focus on offenders rather than on victims.
“What we’re not doing is finding new victims within that data,” said James Cole, national program manager for victim identification at the Department of Homeland Security. “Instead, we were doing the equivalent of shoving that victim in the evidence room.”
“The mantra of Project Vic is: ‘That’s not where you should be focusing your efforts,’” Cole said. “What you should be focusing your efforts on is the stuff that didn’t hit. It’s the stuff that’s new to our system because that’s where new victims will be.”
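The triage Cole describes can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the file names, the in-memory set of known hashes and the `triage` helper are hypothetical, and Project Vic's actual database and tooling are not described here. The point it shows is that files whose MD5 digest is already catalogued can be set aside so that attention goes to the files that did not hit.

```python
import hashlib


def md5_hex(data: bytes) -> str:
    """Return the MD5 digest of a file's bytes as a hex string."""
    return hashlib.md5(data).hexdigest()


def triage(seized: dict[str, bytes], known_hashes: set[str]) -> tuple[list[str], list[str]]:
    """Split seized files into known hits and never-before-seen material.

    `seized` maps a file name to its raw bytes; `known_hashes` stands in
    for the shared hash database (an assumption for this sketch).
    """
    hits, new = [], []
    for name, data in seized.items():
        if md5_hex(data) in known_hashes:
            hits.append(name)
        else:
            new.append(name)
    return hits, new


# Toy data standing in for a shared hash set and a seized drive.
known_hashes = {md5_hex(b"previously catalogued file")}
seized = {
    "a.jpg": b"previously catalogued file",
    "b.jpg": b"never seen before",
}

hits, new = triage(seized, known_hashes)
print("already identified:", hits)
print("needs review first:", new)
```

Because an MD5 digest changes completely if even one byte of the file changes, this kind of matching only catches exact copies.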
Tools of the trade
Project Vic’s leaders are working on ways to give investigators more than binary-level tools to process forensic data. To that end, they are seeking to foster a network of collaborators who can contribute open standards-based tools to help analyze new child exploitation cases.
Recently, the project adopted an open-data exchange format called OData, which allows vendors to pass data between different forensic tools more easily “instead of being in proprietary boxes,” said Cole, who calls it “one of the huge tenets of our project.”
Using OData, Project Vic also developed a protocol called the Video, Image Classification System (VICS) that supports querying and exchanging hashes without the need to manipulate files directly. VICS was developed to help police agencies focus on victims and other never-before-seen materials.
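To give a sense of what querying hashes over OData might look like, here is a minimal sketch that builds a query URL using the standard OData `$filter` and `$select` options. The service address, the `Media` entity set and the `MD5` and `Category` property names are assumptions made for illustration; the actual VICS schema and endpoints are not given in this article.

```python
from urllib.parse import quote

# Hypothetical service root; not a real Project Vic address.
BASE_URL = "https://vics.example.invalid/odata"


def build_hash_query(entity_set: str, md5: str, fields: list[str]) -> str:
    """Build an OData URL asking for records whose MD5 property equals `md5`."""
    # OData string literals are wrapped in single quotes; spaces are percent-encoded.
    filter_expr = f"MD5 eq '{md5}'"
    query = "&".join([
        "$filter=" + quote(filter_expr, safe="'"),
        "$select=" + ",".join(fields),
    ])
    return f"{BASE_URL}/{entity_set}?{query}"


url = build_hash_query("Media", "d41d8cd98f00b204e9800998ecf8427e", ["MD5", "Category"])
print(url)
```

Only the hash travels in the request, which matches the idea of exchanging hashes rather than handling the files themselves.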
In December 2014, Microsoft donated its PhotoDNA Cloud Service to Project Vic and made it available to other organizations through the Microsoft Azure marketplace. PhotoDNA can help identify copies of an image or video that have appeared on various websites, including copies that look the same but are not exact file matches. The tool is especially useful to investigators trying to determine whether a photo taken by a mobile phone matches a copy of the photo generated by social media sites, for example.
“When the next person uses those hashes, it’s not only going to pick up on the exact match but it’s going to pick up on visually duplicative matches,” Brown said.
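The difference between an exact match and a visually duplicative match can be illustrated with a generic "average hash," a simple perceptual hashing technique. This is not PhotoDNA, whose algorithm is proprietary and not described in this article; the tiny 4x4 grayscale grids below are made-up stand-ins for real images.

```python
import hashlib


def average_hash(pixels: list[list[int]]) -> int:
    """Set one bit per pixel: 1 if the pixel is brighter than the image mean."""
    flat = [p for row in pixels for p in row]
    mean = sum(flat) / len(flat)
    bits = 0
    for p in flat:
        bits = (bits << 1) | (1 if p > mean else 0)
    return bits


def hamming(a: int, b: int) -> int:
    """Count the bits that differ between two hashes."""
    return bin(a ^ b).count("1")


def raw_md5(pixels: list[list[int]]) -> str:
    """MD5 of the raw pixel bytes, for comparison with the perceptual hash."""
    return hashlib.md5(bytes(p for row in pixels for p in row)).hexdigest()


original = [
    [10, 20, 200, 210],
    [15, 25, 205, 215],
    [12, 22, 202, 212],
    [11, 21, 201, 211],
]
# A slightly brightened copy, as a re-encoded upload might be.
brighter = [[min(255, p + 5) for p in row] for row in original]
# A mirrored image, which is visually different content.
mirrored = [row[::-1] for row in original]

print("MD5 equal:", raw_md5(original) == raw_md5(brighter))
print("distance to brighter copy:", hamming(average_hash(original), average_hash(brighter)))
print("distance to mirrored image:", hamming(average_hash(original), average_hash(mirrored)))
```

A small Hamming distance suggests the two images look alike even though their MD5 digests no longer match; the threshold for calling something a match would be a tuning decision.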
More recently, advances in imaging forensics have prompted development of tools that can perform more complex matches and help law enforcement agents pursue victim-centric strategies. That includes tools from Griffeye, formerly NetClean. The firm said in April that its Analyze Digital Investigator would incorporate Analyze Relations, a feature that will “actively help to connect the dots between images and assist in building visual maps that abstract intelligence from visual big data.”
The software identifies relationships within images by comparing multiple types of data and metadata, including what kind of camera was used to take the photos, attributes within the images, and where and when the image was taken. More than 2,500 law enforcement agencies in 30 countries use the Analyze platform, the company said.
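One simple way to picture linking images through metadata is to group them by a shared attribute such as the camera that took them. The sketch below is an assumption-laden illustration, not Griffeye's method: the `ImageRecord` fields, the serial numbers and the `group_by_camera` helper are all hypothetical.

```python
from collections import defaultdict
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class ImageRecord:
    """Hypothetical metadata extracted from one image."""
    name: str
    camera_serial: str
    taken_at: datetime


def group_by_camera(records: list[ImageRecord]) -> dict[str, list[str]]:
    """Return the cameras that produced more than one image, with image names."""
    groups: dict[str, list[str]] = defaultdict(list)
    for record in records:
        groups[record.camera_serial].append(record.name)
    # Keep only serials that link at least two images together.
    return {serial: names for serial, names in groups.items() if len(names) > 1}


records = [
    ImageRecord("case1_001.jpg", "SN-1234", datetime(2014, 3, 1, 10, 0)),
    ImageRecord("case2_417.jpg", "SN-1234", datetime(2014, 3, 1, 10, 5)),
    ImageRecord("case3_002.jpg", "SN-9999", datetime(2013, 7, 9, 18, 30)),
]

links = group_by_camera(records)
print(links)
```

Here two images from different cases share a camera serial, which is the kind of connection an investigator might want surfaced; time and location could be grouped in the same way.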
Recently, Project Vic began exploring more complex facial recognition techniques, particularly for images that don’t show a conventional snapshot view of the subject.
“Doing facial recognition from images that are all in conformity is easy because you can count the different points on the face and actually match them,” Brown said. “What we’re looking at is more complex facial recognition, where you get a three-quarter or tilted view of the child or suspect.”
Project Vic is also evaluating technology Microsoft is working on that can gauge a person’s age. “It would be useful if an investigator can say, ‘Show me all females who are 18 or younger or show me any six-year-old,’” Brown said.
Another tool, dubbed F1 Video, would help investigators identify images hidden or obscured in often hard-to-reach video formats. The technology creates a hash of offending video clips that might be a short burst of a child pornographic video appearing several minutes into another piece of video or movie. F1, donated to ICMEC by Friend Media Technology Systems, allows investigators to crop the abusive material and put it into the cloud, where it can be matched against other video categories.
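The idea of finding a short clip buried inside a longer video can be sketched by hashing frames and searching for the clip's sequence of hashes. This is an illustration only and does not represent how F1 works, which the article does not detail; frames are modeled as byte strings, and exact SHA-256 hashes are used where a real tool would presumably need matching that survives re-encoding.

```python
import hashlib


def frame_hashes(frames: list[bytes]) -> list[str]:
    """Hash every frame so sequences can be compared without the raw frames."""
    return [hashlib.sha256(frame).hexdigest() for frame in frames]


def find_clip(video: list[bytes], clip: list[bytes]) -> int:
    """Return the index of the first frame where `clip` starts in `video`, or -1."""
    video_h, clip_h = frame_hashes(video), frame_hashes(clip)
    if not clip_h:
        return -1
    for start in range(len(video_h) - len(clip_h) + 1):
        if video_h[start:start + len(clip_h)] == clip_h:
            return start
    return -1


# Toy "video": six frames, with a known two-frame clip starting at frame 3.
video = [b"f0", b"f1", b"f2", b"f3", b"f4", b"f5"]
known_clip = [b"f3", b"f4"]

print("clip starts at frame:", find_clip(video, known_clip))
```

The returned offset tells an investigator where in the longer file the known material begins.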
Collaborators say Project Vic’s mission is to create an ecosystem of data-sharing partners to protect victims and find perpetrators of child exploitation.
Project Vic seems to be meeting both goals. By the end of 2014, partner organizations had identified and rescued more than 1,030 child victims. And within a three-year span, the project helped increase criminal arrests by 67 percent and convictions by 55 percent.
Cole looks at the success in this way: “When we in law enforcement child exploitation cases focus on offenders, we will miss victims. But if we focus on finding victims, we will not miss the offenders.”
Paul McCloskey is senior editor of GCN. A former editor-in-chief of both GCN and FCW, McCloskey was part of Federal Computer Week's founding editorial staff.