
Records dilemma: When in doubt, throw it out?

Like many government social service agencies, the Illinois Department of Human Services was having trouble managing paper-based forms. But the answer DHS found started with the decision not to digitize its archives. Instead, it decided to digitize only new eligibility forms, leaving the archives to gradually disappear.

In 2010, DHS had more than 100 million pieces of paper stored in case files at local offices and warehouses throughout the state, taking up space and hurting caseworker productivity. The agency embarked on a “going-forward strategy.”

Instead of attempting to scan all of the existing printed documents, CIO Doug Kasamis spearheaded efforts to digitize and store the three types of benefit eligibility forms that collectively make up nearly 70 percent of the agency’s total form volume, according to Government Technology magazine.

Rather than print and store the new case files, DHS turned them into PDFs with metadata and stored them in the agency’s IBM content management system. And because the department’s records retention policy requires that forms be stored for only five years, the stored hard-copy files will be phased out as DHS continues to digitize forms, Government Technology reported.

“Rather than trying to figure out a way to scan all that legacy paper, we’re basically getting rid of 20 percent of our problem every year over the next five years,” Kasamis said.
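The arithmetic behind that 20 percent figure can be sketched in a few lines of Python. This is only an illustration, not DHS's own model: it assumes the legacy paper is spread evenly across the five retention years, so one fifth ages out each year, and it uses the article's round figure of 100 million pages as a starting point. The function name `remaining_legacy_pages` is hypothetical.

```python
def remaining_legacy_pages(initial_pages: int, retention_years: int, years_elapsed: int) -> int:
    """Estimate how much legacy paper is still on the shelves.

    Assumption (not from the article): the backlog is spread evenly over
    the retention period, so an equal share expires every year.
    """
    if retention_years <= 0:
        raise ValueError("retention_years must be positive")
    # Clamp so the result never goes below zero or above the starting total.
    years_left = min(max(retention_years - years_elapsed, 0), retention_years)
    return initial_pages * years_left // retention_years


if __name__ == "__main__":
    # Illustrative starting point: the article's "more than 100 million" pieces of paper.
    for year in range(6):
        print(year, remaining_legacy_pages(100_000_000, 5, year))
```

Under those assumptions the backlog falls by 20 million pages a year and reaches zero after the fifth year, with nothing scanned along the way.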

Most government agencies wage a daily battle against records storage and retention. Add big data hype into the mix and records managers are on the horns of a dilemma: whether to cut down on the amount of stored data or save it, in hopes that some nugget, somewhere, will justify the costs of storage and analysis.

But according to Jeff Clark, writing in the Data Center Journal, most of the data in storage is useless and identifying useful data is difficult at best. For some, he said, "the value to be gained from analyzing massive amounts of information is insufficient to justify the costs of implementing a big data analytics system."

But big data is just the latest addition to the records management headache. Even management and storage of e-mail can be problematic. When the Environmental Protection Agency wanted to move its 15-year-old e-mail system to the cloud, it discovered some mailboxes with more than a million objects in them, making moving the entire e-mail system unfeasible. Working with Lockheed Martin, the EPA weeded out mailboxes with large attachments and archived legacy information that was older than a year.

A recent survey of federal records managers by MeriTalk found that records management issues hinder agency operations and cause budget overruns. 

The report recommends agency managers consider what records they have, who needs them, for what purpose and for how long. They should then digitize those records first and destroy older inactive records that are no longer needed for compliance or business reasons. A common mistake when converting paper records to an electronic format, according to the report, is to scan and then save everything.
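As a rough illustration of that triage, the sketch below sorts records into three buckets. It is not taken from the MeriTalk report: the `Record` fields, the `triage` function and the three labels are all hypothetical, and a real retention schedule would be set by the agency's records policy rather than a single number of years.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Record:
    record_id: str
    created: date
    retention_years: int  # hypothetical field: how long policy requires the record be kept
    active: bool          # hypothetical field: someone still needs it for current business


def retention_expiry(created: date, years: int) -> date:
    """Date on which the retention period ends."""
    try:
        return created.replace(year=created.year + years)
    except ValueError:
        # Feb. 29 in a year that is not a leap year: fall back to Feb. 28.
        return created.replace(year=created.year + years, day=28)


def triage(record: Record, today: date) -> str:
    """Return 'digitize', 'destroy' or 'hold' for one record."""
    if record.active:
        return "digitize"  # records people still use are converted first
    if today >= retention_expiry(record.created, record.retention_years):
        return "destroy"   # inactive and past retention: no compliance or business need
    return "hold"          # inactive but still inside its retention period
```

The point of the third bucket is that "scan and then save everything" is never a branch: an inactive record is either held until its retention runs out or destroyed.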

About the Author

Susan Miller is executive editor at GCN.

Over a career spent in tech media, Miller has worked in editorial, print production and online, starting on the copy desk at IDG’s ComputerWorld, moving to print production for Federal Computer Week and later helping launch websites and email newsletter delivery for FCW. After a turn at Virginia’s Center for Innovative Technology, where she worked to promote technology-based economic development, she rejoined what was to become 1105 Media in 2004, eventually managing content and production for all the company's government-focused websites. Miller shifted back to editorial in 2012, when she began working with GCN.

Miller has a BA from West Chester University and an MA in English from the University of Delaware.

Connect with Susan at @sjaymiller.


Reader Comments

Wed, Mar 27, 2013 Larry CA

The Illinois DHS decision is one of the smarter ones I've seen when it comes to dealing with growing volumes of information. It takes a focused, targeted approach: assessing a specific "type" of information that grows exponentially and making a "day forward" decision to scan and retain new information electronically while discarding legacy paper records on an annual basis until they are gone. Hopefully, sufficient metadata elements are being captured on the PDFs to allow access to the new files, and the repository is properly protected against improper access to protect the privacy of those whose information is contained in these files. With a five-year retention, staff are likely to come and go, and their roles may change over time, removing any need for access, so those factors must be properly controlled. Similarly, replicated copies of the data and/or backups must be properly protected. One "new" consideration that the DHS didn't have before will be ensuring access during power outages and network problems; those didn't impact the paper records. As for the balance of the comments in the article regarding Federal records issues, it's a real mixed bag. Data/records are subject to retention requirements, must be properly indexed, and remain accessible... some for extended periods of time, such as 75 years or longer. Even email has to be looked at based on CONTENT to determine retention; there is no 'retention period' for email, which is only a method of communication, NOT a record series. The MeriTalk survey doesn't go far enough to identify the issues facing some Agencies; they are not all alike.
