DARPA makes strides in searching the ‘deep web’
The “deep web,” a concept more in keeping with fiction than science, gained widespread attention after the FBI shut down Silk Road, the Internet’s premier international one-stop shop for all things contraband.
A so-called “anonymous marketplace,” the site ran on Tor, free software that makes it difficult to trace Internet activity by sending traffic through a worldwide volunteer network of thousands of relays.
The deep web roamed by the denizens of Silk Road makes up the majority of the Internet, according to experts cited in a CBS News report. They assert that the commercial Internet – the .coms, .nets, .govs, .orgs and .mils typically accessed through mainstream search engines – accounts for only about 5 percent of Internet traffic.
The other 95 percent has proven to be a cyber safe haven for all types of illicit activity, from narcotics trade to illegal weapons. Law enforcement officials have gone to great lengths to prevent such illegal activity, but if they don’t know where to look in the deep web, such marketplaces can be next to impossible to find.
Until now, that is. The Defense Advanced Research Projects Agency developed a search engine last year capable of searching the deep web.
DARPA’s goal for Memex, as the search engine is called, is to develop the next generation of search technologies and revolutionize the discovery, organization and presentation of search results and, along the way, shine a light into the deep web.
“The goal is for users to be able to extend the reach of current search capabilities and quickly and thoroughly organize subsets of information based on individual interests,” according to a DARPA report on Memex.
“Memex also aims to produce search results that are more immediately useful to specific domains and tasks and to improve the ability of military, government and commercial enterprises to find and organize mission-critical publicly available information on the Internet.”
While DARPA intends for Memex to be used in the public market, initially it will be used by law enforcement to combat human trafficking and other illicit activity by monitoring chat rooms, online forums, advertisements, job postings and hidden services. One of the complexities of the deep web is that much illicit content does not stay online long enough for search engines to “crawl” it.
As part of Memex, DARPA is working with 17 different teams of researchers from industry and universities to develop tools to give government agencies ways to access these dark reaches of the web.
In a recent success story reported by Scientific American, law enforcement officials used Memex to help locate a victim of sex trafficking.
The Memex system incorporates eight different open-source and browser-based search and analysis programs to perform data analytics.
DARPA is still holding much of the Memex technology close to its vest, but tidbits of information have trickled out since its inception.
According to the report in Scientific American, DARPA researchers have also made progress in creating tools that help analysts identify relationships among different pieces of forensic data. The software also helps investigators build data maps that visualize the links among hundreds of these data associations. It can identify relationships between a single piece of data – an email address, for example – and hundreds of websites.
For instance, Memex can create heat maps that illustrate where other pieces of forensic data – classified ads, for example – are most heavily concentrated. The visualizations help highlight associations that might otherwise be overlooked, according to the Scientific American report.
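To make the idea of linking one piece of data to many sites concrete, here is a minimal Python sketch. It is not Memex code, which DARPA has largely not made public. The function names, the email pattern, the record fields (site, city, text) and the sample ads are all hypothetical, invented for illustration. The sketch builds a lookup from each email address to the sites where it appears, then counts ads per city, the kind of tally a heat map could be drawn from.

```python
import re
from collections import Counter, defaultdict

# Simplified email pattern, assumed for illustration; real-world matching is messier.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def link_identifiers(ads):
    """Map each email address to the set of sites whose ads mention it."""
    sites_by_email = defaultdict(set)
    for ad in ads:
        for email in EMAIL_RE.findall(ad["text"]):
            # Lowercase so the same address written differently still links up.
            sites_by_email[email.lower()].add(ad["site"])
    return dict(sites_by_email)


def concentration_by_city(ads):
    """Count ads per city; these counts could feed a heat map."""
    return Counter(ad["city"] for ad in ads)


# Hypothetical sample records; none of this is real data.
SAMPLE_ADS = [
    {"site": "site-a.example", "city": "Springfield",
     "text": "Contact anna@mail.example today"},
    {"site": "site-b.example", "city": "Springfield",
     "text": "Write to ANNA@mail.example"},
    {"site": "site-c.example", "city": "Riverton",
     "text": "Reach bob@mail.example for details"},
]

if __name__ == "__main__":
    for email, sites in sorted(link_identifiers(SAMPLE_ADS).items()):
        print(email, "->", sorted(sites))
    print(concentration_by_city(SAMPLE_ADS).most_common())
```

In this toy data, the same address appears on two different sites, which is the kind of association an analyst might miss when reading ads one at a time.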
The New York District Attorney’s office said it now uses Memex in every human trafficking case it is pursuing. “Memex helps us build evidence-based prosecutions,” says Manhattan District Attorney Cyrus R. Vance, Jr. “In these complex cases prosecutors cannot rely on traumatized victims alone to testify. We need evidence to corroborate.”
Posted by Mark Pomerleau on Feb 11, 2015 at 1:43 PM