Boston puts open-data quality first
- By Stephanie Kanowitz
- Mar 20, 2017
Boston’s new cloud-hosted open-data website, Analyze Boston, focuses on making data usable and accessible, not just available. That’s why officials didn’t simply move all their datasets from the previous portal to the new one.
“We want to make the site into a place where people can go to get information and knowledge, not just data,” said Andrew Therriault, Boston’s chief data officer.
To figure out what data to release, the city formed a Content Advisory Group of internal and external stakeholders to “really help us dig us through what is a good dataset and what should be made public,” said Howard Lim, Boston’s open data product manager. For Boston, a high-quality dataset has two requirements, he said. One is “making a connection between the data publisher and the data consumer by providing metadata, and two, high-quality also means datasets are up-to-date and readily accessible and really convey the time frame for when a dataset is relevant.”
To that end, when Analyze Boston was released March 1 in beta form, it had 117 datasets – fewer than the previous portal. But that’s misleading, Therriault said, because to prepare the data from the old site for the new one, his team combined multiple datasets based on the same topic but from different time periods into a single dataset. For example, employee earnings reports used to be in separate datasets for each year. Now they’re separate files in the same dataset.
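The consolidation Therriault describes can be pictured with a short sketch. This is only an illustration under stated assumptions, not the city's actual process: the record fields (`topic`, `year`, `file`), the file names, and the `consolidate` helper are all hypothetical.

```python
from collections import defaultdict


def consolidate(datasets):
    """Group per-year datasets by topic into one dataset with one file per year.

    Each input record is a dict with hypothetical keys "topic", "year" and "file".
    """
    grouped = defaultdict(list)
    for record in datasets:
        grouped[record["topic"]].append(
            {"year": record["year"], "file": record["file"]}
        )
    # One output dataset per topic; its files are ordered by year.
    return [
        {"topic": topic, "resources": sorted(files, key=lambda r: r["year"])}
        for topic, files in grouped.items()
    ]


# Hypothetical old-portal listing: one dataset per year.
old_portal = [
    {"topic": "employee-earnings", "year": 2015, "file": "earnings-2015.csv"},
    {"topic": "employee-earnings", "year": 2014, "file": "earnings-2014.csv"},
]
new_portal = consolidate(old_portal)  # one dataset holding two yearly files
```

Counted this way, two entries on the old portal become one on the new portal, which is why a lower dataset count does not by itself mean less data.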
Powered by OpenGov’s CKAN-based Open Data platform and hosted in the cloud, Analyze Boston also improves users’ ability to create applications based on the data. Although the previous portal had application programming interface access, users didn’t find it responsive enough to use, Therriault said. Data that was sometimes out of date or incomplete also discouraged use, he added.
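For developers, a CKAN-based portal typically exposes CKAN's Action API, which returns JSON. The sketch below shows how an application might build a dataset search request and read the titles from the response. It assumes the portal follows the standard CKAN `package_search` endpoint and that its base URL is `https://data.boston.gov`; neither detail is stated in this article, and the helper names are hypothetical.

```python
import json
from urllib.parse import urlencode

# Assumption: base URL of the portal (not given in the article).
BASE_URL = "https://data.boston.gov"


def package_search_url(query, rows=5, base_url=BASE_URL):
    """Build a URL for CKAN's package_search action (dataset search)."""
    params = urlencode({"q": query, "rows": rows})
    return f"{base_url}/api/3/action/package_search?{params}"


def dataset_titles(response_text):
    """Pull dataset titles out of a package_search JSON response body."""
    payload = json.loads(response_text)
    if not payload.get("success"):
        raise ValueError("CKAN reported an unsuccessful request")
    return [package["title"] for package in payload["result"]["results"]]
```

To try it against a live portal, the URL from `package_search_url("earnings")` could be fetched with `urllib.request.urlopen` and the decoded body passed to `dataset_titles`.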
Now, he said, “we are really investing in making sure that the data we’re publishing there has a long-term plan to be sustained and in a consistent fashion, where people can know that it’s going to be available and be up-to-date when they need it.”
Therriault sees two main audiences for the open data portal: external users and Boston government users. For example, the new website showcases how users are interacting with the data -- information that’s tailored to the in-government audience. There’s a budget application that shows how operating and capital budgets are allocated across city departments and a map of capital projects planned for the fiscal year. There’s also the BuildBPS Dashboard, housed on a web-based platform and providing analysis tools and data visualizations for school buildings.
Analyze Boston will officially launch this spring, and there’s a roadmap in place for the datasets that will be released in the next six months.
“We’ve been actively partnering with the police department and our [Emergency Medical Services] system to look at public safety-related data to get as much published on there as possible,” Therriault said. The data team has also been working with the city’s Environment Department to understand utility usage by city properties.
With this website, Boston also changed how it licenses its data -- a move the Sunlight Foundation has praised. Previously the standard was a Creative Commons attribution license, which was “fairly permissive,” Therriault said. But it attached some stipulations and could be problematic for users combining the data with other information or applications to build derivative products, because they had to ensure that the data carried the proper attribution all the way down the line.
“We decided to make the default license just a public domain license so that there are no restrictions on usage,” Therriault said. “People can do whatever they want with our data. If they want to attribute it to us, great, but we’re trying to be as open as possible to make it easier for this data to be used elsewhere.”
Boston’s effort to make data more usable to the public and government users is a trend across governments, said Joel Natividad, OpenGov’s director of open data. “It has to have utility not just for the public. It has to have utility for government itself,” he said.
“Once you’ve got that data available, allowing government folks to use it just as much as citizens is probably going to drive more value in the long term,” added Michael Schanker, head of marketing at OpenGov.
Stephanie Kanowitz is a freelance writer based in northern Virginia.