The newly launched open source search stack lets GSA’s DigitalGov Search perform real-time analytics and create dashboards to monitor Web search trends.
DigitalGov Search, formerly USASearch, offers a free, hosted-search service for federal, state and local government sites. The General Services Administration’s Office of Citizen Services and Innovative Technologies provides the capability as part of its DigitalGov platform, which aims to help agencies offer digital services to the public.
Ammie Feijoo, DigitalGov Search’s manager, said the open source technology, which has been in place for a couple of weeks, includes Elasticsearch Inc.’s ELK stack. The ELK stack consists of Elasticsearch, a real-time search and analytics engine; Logstash, a log management tool; and Kibana, a data visualization engine for creating dashboards.
The GSA’s search technology is, in essence, a big data initiative in that it aims to make government data sets more accessible to the public. The effort to bolster search supports the agency’s goal of “continually improving the interaction and experience” when citizens connect to the government, noted Martha Dorris, deputy associate administrator at GSA’s Office of Citizen Services and Innovative Technologies.
That the search box has become a key vehicle for communication has not been lost on other agencies looking to improve their digital offerings to the public. The National Library of Medicine, for instance, retooled its Web search capabilities using IBM’s InfoSphere Data Explorer software following complaints about poor quality search results, according to an IBM case study published in September.
Open search at no cost
GSA’s use of open source helps it provide search technology at no cost to agencies. And since GSA offers DigitalGov Search as a hosted service, customers need not concern themselves with the intricacies of the technology. “Agencies don’t have to keep up with search, which is a rapidly changing industry, and can provide good, relevant search results on websites,” Feijoo said.
Federal agency websites using GSA’s hosted service to power their search boxes include Defense.gov, DHS.gov and Treasury.gov. State government customers span the United States from Hawaii to Maine.
The newly launched open source search stack lets DigitalGov Search perform real-time analytics and create dashboards to monitor Web search trends. And there’s ample search data to analyze given that users are querying DigitalGov Search boxes to the tune of 24 million searches per month.
And with the dashboard-building capability, Feijoo said, she is “able to surface trends that I may not have otherwise seen when they are buried into the data.”
As a consequence, DigitalGov Search can notify agencies when a particular query is trending, Feijoo said. Based on that alert, agencies may want to promote certain content or create new content that answers the question, she added.
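The article does not describe how DigitalGov Search decides that a query is trending. As a rough illustration only, one simple approach is to compare each query's share of recent searches with its share over a longer baseline period and flag those that have jumped. The sketch below assumes that approach. The function name, the thresholds and the sample queries are hypothetical, and none of it is taken from GSA's system or from the ELK stack's own APIs.

```python
from collections import Counter


def trending_queries(recent, baseline, min_count=3, ratio=2.0):
    """Flag queries whose share of recent searches is at least `ratio`
    times their share of baseline searches.

    Hypothetical helper for illustration; thresholds are arbitrary.
    """
    # Normalize case and whitespace so "Flu Shot" and "flu shot" count together.
    recent_counts = Counter(q.strip().lower() for q in recent)
    baseline_counts = Counter(q.strip().lower() for q in baseline)
    recent_total = sum(recent_counts.values()) or 1
    baseline_total = sum(baseline_counts.values()) or 1

    flagged = []
    for query, count in recent_counts.items():
        if count < min_count:
            continue  # ignore queries too rare to be meaningful
        recent_share = count / recent_total
        # Add-one smoothing keeps queries unseen in the baseline from dividing by zero.
        baseline_share = (baseline_counts[query] + 1) / (baseline_total + 1)
        if recent_share / baseline_share >= ratio:
            flagged.append((query, count))
    # Most frequent first, ties broken alphabetically.
    return sorted(flagged, key=lambda item: (-item[1], item[0]))


if __name__ == "__main__":
    # Made-up sample data, not real DigitalGov Search traffic.
    baseline = ["passport renewal"] * 10 + ["tax forms"] * 10 + ["flu shot"]
    recent = ["flu shot"] * 6 + ["passport renewal"] * 3 + ["Tax Forms"]
    print(trending_queries(recent, baseline))
```

With this made-up sample, only "flu shot" is flagged, because its share of recent searches is several times its baseline share. A production system would more likely compute such comparisons inside the search engine rather than in application code.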
In general, real-time analytics makes DigitalGov Search more nimble in providing data back to the websites’ search results pages to answer public inquiries, Feijoo said.
According to the DigitalGov Search website, GSA’s hosted search service also provides customers the following capabilities:
- Support for an “unlimited number of Web pages across an unlimited number of domains” as well as the ability to integrate tweets, YouTube videos and Flickr photos.
- The ability to customize the search experience, using DigitalGov Search’s Admin Center to review search analytics, tell GSA what content to index, set up the display of the results page and curate recommended pages.
- 99.95 percent-plus uptime and the ability to deliver search results in less than half a second.
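To put the service figures in perspective, the short calculation below converts the 99.95 percent uptime figure into a monthly downtime allowance and the 24 million monthly searches cited above into an average query rate. It assumes a 30-day month and evenly spread traffic, neither of which the article states, and the function names are illustrative.

```python
def downtime_budget_minutes(uptime_percent, days=30):
    """Minutes of downtime allowed in `days` days at the given uptime percentage."""
    total_minutes = days * 24 * 60
    return total_minutes * (100 - uptime_percent) / 100


def average_queries_per_second(searches_per_month, days=30):
    """Average query rate, assuming traffic is spread evenly (real traffic is not)."""
    return searches_per_month / (days * 24 * 60 * 60)


if __name__ == "__main__":
    print(round(downtime_budget_minutes(99.95), 1))           # minutes per 30-day month
    print(round(average_queries_per_second(24_000_000), 1))   # searches per second
```

Under those assumptions, 99.95 percent uptime allows roughly 21.6 minutes of downtime per month, and 24 million searches per month averages a little over nine searches per second.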
Open source history
While this particular open source stack is new, GSA is not new to open source search in general. The agency has been using open source Hadoop in its backend search system since 2010. Hadoop, the Apache Software Foundation’s distributed computing framework, is associated with big data projects. The Hadoop Distributed File System distributes large data sets across servers in a Hadoop cluster. In addition, associated Apache Hive software provides data warehousing and analytics capabilities for Hadoop clusters.
Feijoo said DigitalGov Search will use Hadoop and its newer open source search technology at the same time to provide different answers. The ELK stack provides her -- and other users on the content side -- with easier access to the data, she said, contrasting content-oriented users with developers and others more familiar with a harder-to-use tool such as Hadoop.
“The ELK stack, and in particular Kibana, allows content-oriented users like me the ability to create ad-hoc dashboards to see and interact with my data, without doing any coding or having to know any query language,” Feijoo said.
In a March blog post, Feijoo outlined DigitalGov Search’s open source strategy, which will include contributing to open source projects. Her organization will pursue a “share first” approach, she said, in which code for every new feature will be written so that any party can use the code. In addition, DigitalGov Search plans to expose its application programming interfaces so “anyone can make use of the data, not just searchers on government websites.”