
XML, workflow tools ease the burden of managing regulations

Government regulation can be a burden not only for those who are being regulated but for those who write the regulations as well.

In Connecticut, the writing process is largely paper-based, from creation through formatting, printing and distribution. But because regulations are frequently revised, keeping track of official updates can be a chore, although a critical one, said Chris Drake, the deputy legal counsel to Governor Dannel Malloy.

“Some agencies didn’t know where the most recent text-edited version of a regulation was,” Drake said. “It was a quality control nightmare.”

The state’s tools and processes for maintaining accurate, timely updates were incompatible or out of sync. In government, regulations need to be publicly available, and PDF versions of the state’s regs were put online a decade ago. Yet while posting is easy, Drake said, updating is hard. Writing and editing were still being done on paper or with simple word processing, producing online files that quickly became out of date.

“We needed a system that was more transparent and accessible,” Drake said. So the governor initiated e-Regulation, an effort to bring writing and publishing of regulations into the 21st century.

A year and a half into the program, the requirements and design phases are now complete and system testing is about to begin. If all goes on schedule, a new Web portal with authoritative versions of all regulations in XML format will go live in October, supported by a back-end system for writing and tracking regs through the approval process.

The content and document management pieces of the system are being provided by Fairfax Data Systems, and conversion of the existing PDF versions of regulations to a modern format is being done by Data Conversion Laboratory (DCL). “I’m really confident the Web site will work,” Drake said in March, when testing of the systems was beginning. The portal is the most customized part of the system.

Authoring, workflow and conversion also should not present problems, said Fairfax CEO David Mancusi. “Individually, these pieces have been around forever,” he said. “Putting them together is the magic.” The pieces being assembled by Fairfax include Quark XML Author for Microsoft Word and IBM Case Manager for workflow.

DCL is also converting the existing code library for the new portal, using proprietary tools to extract text from existing PDF files in modular form so that several people can collaborate on a document. A conversion tool will put documents into Darwin Information Typing Architecture (DITA) XML, and a hub will manage the different pieces of each document during the process.

Plan of action

Connecticut is a relative latecomer to the task of digitizing its regulatory processes. Once it became apparent that posting existing regs online in a static format was not sufficient, Drake began talking with other states about modernization and found many had begun the process as early as the 1990s. States including Utah, Texas, Colorado and Virginia all were ahead of Connecticut, he said. “We didn’t have anything near what they have.”

Connecticut was fortunate to have the governor’s backing for the program, because so many state agencies write regulations. “Each agency cares about its own rules, and no one agency has enough skin in the game to see it through,” Drake said. E-Regulation is a statewide program, hosted in the office of the Secretary of State.

The goal was to have the system go live within 18 months of selecting the vendor. “We thought this was fairly reasonable, since we weren’t implementing custom code solutions,” Drake said. “We’ve finished with design and are about two months into the construction phase.”

How it works

The first component in the solution is XML Author, a Word add-on that lets writers create XML documents without needing specialized knowledge of the eXtensible Markup Language. It looks mostly like plain Word to the user, but produces XML. Once the first draft of a document is done, it moves into the approval process, where the Case Manager workflow tool takes over. The attached document is shepherded through the process, both back and forth and through alternate routes as appropriate. The Web portal polls Case Manager throughout the process so that the current version of the document and its status are available.

Upon approval, Case Manager digitally signs the final document and calls on Quark to publish it on the portal, both in a clean codified version and in a mark-up version, which is useful to lawyers, librarians and others who track version changes in regulations.
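The routing described above, in which a draft moves forward, gets sent back, and is finally published, can be pictured as a small state machine. The sketch below is only an illustration in Python: the stage names, the `Regulation` class and the `poll_status` function are hypothetical and are not taken from Case Manager, Quark or the Connecticut system.

```python
from dataclasses import dataclass, field

# Hypothetical stages and routes; the article does not name the real workflow steps.
TRANSITIONS = {
    "draft": {"agency_review"},
    "agency_review": {"draft", "legal_review"},   # can be sent back to the writer
    "legal_review": {"agency_review", "approved"},
    "approved": {"published"},
    "published": set(),                            # terminal state
}


@dataclass
class Regulation:
    title: str
    status: str = "draft"
    history: list = field(default_factory=lambda: ["draft"])

    def move_to(self, new_status: str) -> None:
        """Move the document forward or back, rejecting routes not listed in TRANSITIONS."""
        if new_status not in TRANSITIONS[self.status]:
            raise ValueError(f"cannot move from {self.status} to {new_status}")
        self.status = new_status
        self.history.append(new_status)


def poll_status(reg: Regulation) -> dict:
    """What a portal might read when it polls the workflow tool (assumed shape)."""
    return {"title": reg.title, "status": reg.status}


reg = Regulation("Sample regulation")
# One back-and-forth loop before approval and publication.
for step in ["agency_review", "draft", "agency_review", "legal_review", "approved", "published"]:
    reg.move_to(step)
print(poll_status(reg))
```

Keeping the allowed routes in one table means an out-of-order step, such as publishing a draft that was never approved, fails loudly instead of silently changing the record.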

Although the process takes care of new and newly revised regulations, there remains a large body of regulations already in use that also need to go onto the portal. “You need access to everything,” said Mark Gross, president and CEO of DCL. “You can’t do it over a period of 20 years; you have to have everything available.”

Connecticut’s goal is to have the existing PDF documents converted to XML when the portal goes live in October. The first step is extracting the text from the PDF so that XML tagging can be added. This is done in a modular way, so that pieces of a document can be worked on by different people at the same time, then be recombined through the hub into a coherent XML document. Once conversion is complete, the document goes to a human editor for quality assurance.

“Extraction is a mostly automated process,” Gross said. “The trick is to do it in a consistent manner, which is not that easy.”
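The modular approach, in which pieces are tagged separately and then recombined, can be illustrated with a short Python sketch using the standard library's `xml.etree.ElementTree`. Everything here is an assumption made for illustration: DCL's tools are proprietary, the element names are only loosely DITA-like and not validated against the DITA specification, and the sample text is a made-up placeholder rather than an actual regulation.

```python
import xml.etree.ElementTree as ET


def split_into_modules(text):
    """Split extracted text at blank lines; the first line of each chunk is its title."""
    modules = []
    for index, chunk in enumerate(text.strip().split("\n\n")):
        lines = [line.strip() for line in chunk.splitlines() if line.strip()]
        modules.append({"order": index, "title": lines[0], "paragraphs": lines[1:]})
    return modules


def tag_module(module):
    """Wrap one module in XML; this step could be done by different people in parallel."""
    topic = ET.Element("topic", id=f"sec-{module['order']}")
    ET.SubElement(topic, "title").text = module["title"]
    body = ET.SubElement(topic, "body")
    for paragraph in module["paragraphs"]:
        ET.SubElement(body, "p").text = paragraph
    return topic


def recombine(tagged):
    """Stand-in for the hub: reassemble tagged pieces in their original order."""
    root = ET.Element("regulation")
    for _, topic in sorted(tagged, key=lambda pair: pair[0]):
        root.append(topic)
    return root


# Placeholder text standing in for what would be extracted from a PDF.
sample = (
    "Sec. 1 Definitions\nTerms used in this part.\n\n"
    "Sec. 2 Scope\nThis part applies statewide."
)
modules = split_into_modules(sample)
# Tag the pieces in reverse to mimic work finishing out of order.
tagged = [(m["order"], tag_module(m)) for m in reversed(modules)]
document = recombine(tagged)
print(ET.tostring(document, encoding="unicode"))
```

Carrying an order number with each piece is what lets the hub rebuild a coherent document regardless of which piece is finished first.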

A large part of the project involves preparing and configuring the system for the customer. “We like to allow six to 10 weeks” for identifying specifications for the job, Gross said. “There are a lot of details here.”

Once the details are worked out, the process can go pretty quickly, up to 1,000 pages a week, depending on the customer’s ability to check and accept finished documents. Connecticut’s 15,000 pages are expected to take 10 to 12 weeks to convert.

Coming late to the digitization game could prove an advantage for Connecticut. “The states that jumped in early still have a lot of manual, human hours in the process,” Drake said.

Although none of the technology being used is cutting edge, much of it is only recently available. “If we had tried this six or seven years ago we might not have been able to find a solution that does this,” Drake said.

Come this fall, Connecticut hopes to have a state-of-the-art system. “If we accomplish what we set out to do, we will have the best site,” he said.

About the Author

William Jackson is a Maryland-based freelance writer.

