CD library sorts evidence

Mark Weil, a DOD computer forensics examiner, expects a NIST CD to save him time searching computers seized as evidence.

Mark Weil's fingers have been ink-smeared for nearly four years from sorting through case files on everything from child exploitation and prostitution to computer security and e-mail threats.

Weil, an examiner at the Defense Department's computer forensics lab, anxiously awaits the arrival this month of the first CD-ROM version of the National Software Reference Library. It profiles 652,000 common pieces of software and matches them against seized computer files to separate what's innocent from what might be criminal evidence.

The National Institute of Standards and Technology compiled the database on CD-ROM for federal, state and local law enforcement users.

The 125MB database categorizes up to 100,000 file types in seconds so investigators do not have to open every one to look for evidence.

'If you can eliminate 20,000 to 30,000, you reduce your workload by 20 to 30 percent, which is enormous,' Weil said.

For work on time-sensitive cases, eliminating a hunt through thousands of files could be a significant time savings, said Barbara Guttman, a researcher in NIST's information technology lab who helped develop the database.

'Sometimes you're doing it for a case where people are going to die,' Guttman said.

Weil said he remembers having to open hundreds of files on a disk taken from a crime scene to see if any files were germane to the case.

'The database is how we identify the irrelevant files,' Guttman said. Almost any file whose hash exactly matches an NSRL profile is not going to be evidence. Corrupted or altered files that do not match the profiles, however, raise red flags.

'The whole idea of this is to reduce the examiner's time,' said Susan Ballou, program manager of forensic science projects at NIST's Office of Law Enforcement Standards in Gaithersburg, Md.

Examiners 'know what certain files look like and can just ignore them,' Ballou said.

NSRL runs on any computer with standard database management software. It uses four algorithms: Secure Hash Algorithm 1, which computes a condensed representation of a message or a data file; two other hash algorithms, MD4 and MD5; and the CRC-32 checksum algorithm.
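To give a sense of what such a file signature looks like, here is a minimal Python sketch that computes SHA-1, MD5 and CRC-32 values for one file using only the standard library. It is an illustration, not the NSRL software: the function name `file_digests` and the chunk size are assumptions made for this example, and MD4 is left out because many Python builds no longer provide it.

```python
import hashlib
import zlib


def file_digests(path, chunk_size=65536):
    """Return SHA-1, MD5 and CRC-32 values for the file at `path`.

    Hypothetical helper for illustration; not part of any NSRL tool.
    """
    sha1 = hashlib.sha1()
    md5 = hashlib.md5()
    crc = 0
    # Read in chunks so large files from a seized drive fit in memory.
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            sha1.update(chunk)
            md5.update(chunk)
            crc = zlib.crc32(chunk, crc)
    return {
        "sha1": sha1.hexdigest(),
        "md5": md5.hexdigest(),
        # Mask to 32 bits and print as eight hex digits.
        "crc32": format(crc & 0xFFFFFFFF, "08x"),
    }
```

Changing even one byte of the file produces a different SHA-1 and MD5 value, which is why an exact match is a strong sign that the file is an unmodified copy of a known product file.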

Without the CD, investigators like Weil must spend hours or days comparing the seized file signatures to common templates for shrink-wrapped software.

'It takes time to calculate the hash on each file,' Weil said. The NSRL software calculates the hash for him.

If a particular hash from a file on a crime scene disk does not match any NSRL hash, he looks at it.

'Either we didn't get that legal software product on the NSRL, but it's still a legal file, or something on the disk has been altered,' Ballou said.
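The matching step described above can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not the NSRL software or Weil's workflow: `known_hashes` stands in for a set of SHA-1 values loaded from a reference database, and the function names are invented for the example.

```python
import hashlib
import os


def sha1_of_file(path, chunk_size=65536):
    """Return the SHA-1 hex digest of one file, read in chunks."""
    digest = hashlib.sha1()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def split_known_unknown(root, known_hashes):
    """Walk `root` and sort files into known and unknown lists.

    `known_hashes` is an assumed set of SHA-1 hex strings taken from a
    reference library; case is normalized before comparison.
    """
    reference = {h.lower() for h in known_hashes}
    known, unknown = [], []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in sorted(filenames):
            path = os.path.join(dirpath, name)
            if sha1_of_file(path) in reference:
                known.append(path)    # matches a reference file; can be skipped
            else:
                unknown.append(path)  # no match; needs an examiner's attention
    return known, unknown
```

Only the paths in the `unknown` list would need a closer look. As Ballou notes, a file can land there either because it is legitimate software missing from the reference set or because it has been altered.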

Quarterly updates

Investigators can save lots of time by ignoring, for example, Microsoft Windows 2000 or Office 2000 files, which have well-known hashes, Weil said.

To deal with new types of software, NSRL will come out in new versions each quarter. 'Each subsequent version should save the investigators time,' Guttman said.

More often than not, investigators have to copy drives and disks found at a crime scene regardless of the kind of crime, Guttman said.

'Sometimes it isn't a computer crime but the records are in the computer,' she said. 'If you're running an illegal gambling operation, you might store your books on the computer, and law enforcement wants to find those files.'

The FBI and the Defense and Treasury Departments have paid their $90 annual subscriptions for the NSRL disk. The FBI will use NSRL in its Automated Computer Examination System suite.

Guttman said she hopes the database will be useful for intellectual property crimes, too. 'Most everything is on computers now,' she said.

