Untethering internal information from paper
- By Mark Gross
- Feb 03, 2017
It’s safe to say most content developed today is created in a digital format. In the midst of this digital revolution, many agencies have left their non-electronic content behind, trapped in an unstructured or paper format. It’s not just the content citizens might want to access; it's frequently internal information that is stranded -- training documents, research materials, manuals or archives, for instance.
One organization tackling the digitization of these enormous collections of content is the Federal Library and Information Network. Under the umbrella of the Library of Congress, FEDLINK offers centralized acquisition services for the federal library and information center community, as well as for other federal agencies and offices, including those in the District of Columbia. It serves federal libraries and information centers as their purchasing, training and resource-sharing consortium and helps with digitization efforts.
The Bureau of Economic Analysis, for example, took advantage of the FEDLINK program. It had over 2 million print pages of various types of internal documents that served as valuable research materials for the agency’s staff. Before the BEA offices moved to another location, officials decided to digitize this material to make it more accessible and to mitigate the problems of moving the physical documents. Additionally, BEA wanted to wring more value out of its content and boost productivity, usability and efficiency.
The months-long, complex conversion included optical character recognition processing, metadata generation and the digitization of microforms, maps, posters and computer print-outs. The project was supported by an automated production control system that regulated the workflow and allowed full tracking of each item being processed. It also gave BEA a complete, secure web-based reporting and approval platform. As a result, the agency’s content was more searchable, findable and usable -- and the effort reduced the amount of floor space dedicated to storing documents.
Likewise, the United States Patent and Trademark Office set up a system that fully automates the conversion of the paperwork it receives into Extensible Markup Language files -- documents that are both human- and machine-readable.
Untethering information from print formats can generate internal efficiencies and help agencies better meet their mission of disseminating information to their constituencies.
Mark Gross is president of Data Conversion Laboratory (DCL).