GPO's FDsys: 1 billion items served and ready for more
- By William Jackson
- May 08, 2014
On March 26, a visitor to the Government Printing Office’s Federal Digital System (FDsys) downloaded a Federal Register notice from the Securities and Exchange Commission. It was the one billionth object retrieved through the site.
Introduced in 2009, FDsys has become a workhorse portal for public access to more than 1 million official documents, videos, audio recordings and images from all three branches of federal government. Almost half of the 1 billion retrievals came during the last 13 months. The most commonly viewed item is the Patient Protection and Affordable Care Act, with more than 14 million views since it became law in 2010. The system also hosts historical documents, such as the 1964 Warren Commission Report on the assassination of President Kennedy, which has been viewed more than 20,000 times.
FDsys was developed under a disciplined program management process using in-house expertise and infrastructure, designed for high availability and to support high demand volumes. The resulting system is a platform for managing government resources in a variety of formats.
“It’s more complex than just a Web server,” said FDsys manager Lisa LaPlant. Official documents are digitally signed, assuring the user that the documents are genuine and content has not been altered, and the system provides content management and a preservation repository for archival material as well as a site for online public access. It has demonstrated its ability to reliably deliver content during sudden spikes in demand, such as during the release of the federal budget, which now is distributed primarily in a digital format.
But “we are not resting on our laurels,” said GPO’s Chief Technology Officer Ric Davis. A major refresh of FDsys, now in the planning stages, will include an updated search engine and improved support for mobile devices.
FDsys uses Extensible Markup Language (XML) and an ISO standard format for archival information to enable searching across multiple collections, a feature not available in the original GPO website, GPO Access.
“It was really a flat store of files with a search engine on top of it,” LaPlant said of the old Access. “We needed something to better manage and preserve.”
GPO Access was mandated in 1993 to provide public access to GPO products and went online in 1994. It was developed in the infancy of online government, and “it served us very well for 15 years,” Davis said. “But it did have some limitations.” It used a search engine developed in-house, and the site’s only function was to serve up documents over the Internet. By 2000 it was clear that it was in need of an update, and a decision was made to develop a complete replacement.
Planning for FDsys began in 2005 and it was launched in January of 2009, running alongside GPO Access for a year before the older site was retired in 2010.
“To us it seemed like blinding speed,” Davis said of the four-year development process.
The development team used the agile development process of iterative and incremental software development, a cycle that includes planning, development, integration, testing and deployment.
“The part that was really critical was developing requirements,” Davis said. The development team consulted with all stakeholders in the project, including Federal Depository librarians, congressional staff and other agencies that would use the system. The goal was not new technology, but serving the stakeholders. “First of all, you want to do no harm,” he said. The idea was to keep users happy while adding new features. Success required having a clear vision to define the framework.
“There was a lot of involvement from subject matter experts” who produce and use the documents in developing requirements, LaPlant said.
Because GPO must ensure that its digital material will remain available in the future, through both FDsys and the 1,200 libraries in the Federal Depository Library Program, the framework uses the Open Archival Information System, a reference model from the International Organization for Standardization that specifies the concepts for a digital preservation system.
“We knew that was something that would fit our needs,” LaPlant said. A basic element of the model is the Archival Information Package in which both primary material and its metadata are stored as a single object. Managing data and metadata together allows information to be easily searched and to be transferred as one package.
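The idea of keeping primary material and its metadata together as one object can be illustrated with a minimal sketch. This is not GPO's implementation or the packaging format the OAIS model prescribes; the zip layout, the `metadata.xml` file name, the element names and the functions `build_package` and `read_package` are all assumptions made for illustration.

```python
import io
import zipfile
import xml.etree.ElementTree as ET


def build_package(content: bytes, filename: str, metadata: dict) -> bytes:
    """Bundle a document and its metadata into one archive (hypothetical layout)."""
    # Serialize the metadata as a small XML record.
    root = ET.Element("metadata")
    for key, value in metadata.items():
        ET.SubElement(root, key).text = value
    metadata_xml = ET.tostring(root, encoding="utf-8")

    # Write the content and its metadata side by side in a single zip file.
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        archive.writestr(f"content/{filename}", content)
        archive.writestr("metadata.xml", metadata_xml)
    return buffer.getvalue()


def read_package(package: bytes):
    """Recover the content and metadata from a package built above."""
    with zipfile.ZipFile(io.BytesIO(package)) as archive:
        root = ET.fromstring(archive.read("metadata.xml"))
        metadata = {child.tag: child.text for child in root}
        names = [n for n in archive.namelist() if n.startswith("content/")]
        content = archive.read(names[0])
    return content, metadata


if __name__ == "__main__":
    pkg = build_package(
        b"Sample notice text",
        "notice.txt",
        {"title": "Sample notice", "collection": "FR"},
    )
    print(read_package(pkg))
```

Because the package is a single sequence of bytes, it can be stored, copied or transferred as one unit without the metadata becoming separated from the document it describes.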
Metadata is provided in XML, which allows documents to be encoded in a format that is human and machine readable, making it ideal for documents being searched for and accessed through the Internet. XML has become the go-to format for information in digital government initiatives, part of the Obama administration’s policy of leveraging the Web and virtual environments for delivery of government products and services.
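To show what “human and machine readable” means in practice, here is a small sketch that searches XML metadata records from more than one collection. The record structure, element names and collection codes are invented for the example and are not the FDsys metadata schema; `search_titles` is a hypothetical helper.

```python
import xml.etree.ElementTree as ET

# Hypothetical metadata records; a person can read them and a program can parse them.
SAMPLE = """<records>
  <record collection="FR">
    <title>Sample notice on reporting rules</title>
    <year>2014</year>
  </record>
  <record collection="BILLS">
    <title>Sample bill on reporting requirements</title>
    <year>2013</year>
  </record>
  <record collection="FR">
    <title>Sample notice on meeting schedule</title>
    <year>2014</year>
  </record>
</records>"""


def search_titles(xml_text: str, term: str):
    """Return (collection, title) pairs whose title contains the term."""
    root = ET.fromstring(xml_text)
    hits = []
    for record in root.iter("record"):
        title = record.findtext("title", default="")
        if term.lower() in title.lower():
            hits.append((record.get("collection"), title))
    return hits


if __name__ == "__main__":
    for collection, title in search_titles(SAMPLE, "reporting"):
        print(collection, "-", title)
```

Because every record uses the same tagged fields, one query can run across collections without knowing how each source document was formatted.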
In 2012 GPO began replacing its 30-year-old composition engine called Microcomp with XML Professional Publisher (XPP) to enable the direct XML formatting of documents for both electronic and print publication. This eliminated the step of transforming documents for publication in XML.
The decision was made to develop and host FDsys in-house because GPO staff already was familiar with the agency’s systems, the material being produced and the needs of FDsys. With the shift at GPO away from primarily ink-on-paper to electronic publishing, “we had the technical capability to handle it ourselves,” Davis said, although, “I wouldn’t say finding the staff was easy.”
GPO uses a commercial content delivery network for reliable response and to accommodate bursts in demand, but hosts FDsys on its premises, as well as a backup site for continuity of operations and disaster recovery.
The agency is evaluating cloud-based technology for FDsys as part of its upcoming major refresh, along with a new open source search engine, Solr, which promises fault-tolerant performance on a large scale.
A new user interface for mobile devices also is in the works. The FDsys site now is accessible through mobile browsers, “but you have to do a lot of pinching and scrolling,” LaPlant said. A mobile interface has always been on the FDsys roadmap, and with the rapid adoption of the devices, “we saw it was time to do it.”
With FDsys established as scalable and reliable, GPO also is offering it to other agencies as a platform for archiving and disseminating material. It already is hosting such diverse materials as historic Treasury Department documents, audio recordings of radio traffic from Air Force One following the JFK assassination, and President Nixon’s Watergate grand jury testimony. GPO wants to expand these services, possibly to include online sites branded for client agencies.
“It depends on the needs of the client agency,” Davis said.