Picking the right portal to make open data accessible
- By Amanda Ziadeh
- Jul 10, 2015
As government agencies move to make more data publicly available, choosing the right tool for the data is key.
While simple, already-structured or static data that doesn’t need visualization can be posted in any number of ways, other datasets need special handling in order to be truly accessible and useful. A number of commonly used and adaptable open data portals for the public sector were explored in a recent webinar with Dale Lutz and Stewart Harper of Safe Software, a data conversion and integration software company.
Enterprise open source
CKAN is a popular open source data portal that provides helpful tools for streamlining, publishing, sharing, finding and using datasets.
Intended for large enterprise datasets, CKAN has more than 300 open source data management extensions that are constantly evolving. CKAN allows users to host, configure and deploy the data, so typically a skilled IT team is needed for deployment. Features include a fast search experience, easy data uploading and the ability to plot geographic data in an interactive map.
For Data.gov, CKAN works as a data harvester, pulling data from other agencies like the Department of Agriculture and NASA and federating it into one searchable catalog.
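As a rough illustration of what searching a federated catalog looks like to a developer, the sketch below queries a CKAN site through its Action API `package_search` call. The base URL `https://catalog.example.org` and the helper names are placeholders chosen for this example, not details from the webinar, and a real portal may restrict or rate-limit access.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen


def build_search_url(base_url, query, rows=5):
    """Build a CKAN Action API package_search URL for a free-text query."""
    params = urlencode({"q": query, "rows": rows})
    return f"{base_url.rstrip('/')}/api/3/action/package_search?{params}"


def dataset_titles(response):
    """Pull dataset titles out of a decoded package_search response."""
    if not response.get("success"):
        raise ValueError("CKAN reported an unsuccessful request")
    return [pkg.get("title", "") for pkg in response["result"]["results"]]


def search(base_url, query, rows=5):
    """Run the search over HTTP (requires network access to a CKAN site)."""
    with urlopen(build_search_url(base_url, query, rows)) as resp:
        return dataset_titles(json.load(resp))


# Hypothetical usage; replace the placeholder with a real CKAN catalog URL:
# print(search("https://catalog.example.org", "crop yield"))
```

Keeping the URL building and response parsing separate from the network call makes the pieces easy to check without contacting a live catalog.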
DKAN, a derivative of CKAN, is built on Drupal, an open source content management system, and comes with the option for cloud hosting. It is simple to deploy and maintain, and can be self-hosted using code available on GitHub.
ArcGIS Open Data is a go-to solution for Esri software users because the open data builds directly on top of already published ArcGIS services.
ArcGIS Server and ArcGIS Online allow the configuration and federation of geodata into an open data portal. Data and metadata can be viewed in the browser, and users can interact with the data and download it in several formats.
ArcGIS offers a wealth of mapping options for geodata, but does not have other advanced visualization tools. It does, however, provide ways to create charts and simple tools to view and interact with the datasets, and its advanced search and filtering options are user-friendly.
Other visualization options
Organizations that want more data visualization should consider services like Junar, Socrata and OpenDataSoft.
Junar is an easy-to-use, software-as-a-service open data cloud platform that focuses on powerful analysis and visualizations. It offers a range of APIs to enable developers and users to integrate data back into their own applications, and is currently used for open data portals by the cities of Sacramento and Palo Alto.
Socrata can host very large datasets, as both published and working copies, and currently has 5.5 million records.
Users can publish to Socrata using a desktop sync tool or APIs; data can also be uploaded natively as CSV, Excel or TSV files. The portal supports geospatial formats as well, such as shapefiles, KML, KMZ and GeoJSON.
Socrata has tools structured around metadata management and workflow. Users can filter to narrow the information, export data, conduct analytics, create visualizations such as charts and map overlays, and view the data from a spatial perspective.
Another feature is the data-lens experience, which shows a different layout of the data so users can gain insights, inspect and interrogate information without having to actually load the set. Interactive graphs, for example, can show trends.
Chicago uses Socrata for its public data portal, which includes 5.8 million records of crime data dating back to 2001. The New York Police Department also uses Socrata to publish and publicly display crash and collision data.
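For developers, datasets on a Socrata portal are typically reachable through its SODA web API, which accepts query parameters such as `$where` and `$limit`. The sketch below only builds such a request URL. The domain `data.example.gov`, the dataset identifier `abcd-1234` and the `year` column are placeholders assumed for illustration, not real identifiers from the portals mentioned above.

```python
from urllib.parse import urlencode


def build_soda_url(domain, dataset_id, where=None, limit=100):
    """Build a SODA request URL that returns JSON rows from one dataset."""
    params = {}
    if where:
        params["$where"] = where  # SoQL filter expression
    params["$limit"] = limit      # cap the number of rows returned
    # safe="$" keeps the parameter names readable instead of encoding "$" as %24
    query = urlencode(params, safe="$")
    return f"https://{domain}/resource/{dataset_id}.json?{query}"


# Hypothetical usage: request three rows recorded in 2014 or later.
url = build_soda_url("data.example.gov", "abcd-1234", where="year >= 2014", limit=3)
print(url)
```

Fetching the resulting URL with any HTTP client would return a JSON array of records, assuming the dataset is public and the column named in the filter exists.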
OpenDataSoft also allows for interaction and visualization through automated API generation. The platform is easy to use, works well with large datasets, supports geospatial formats and leverages Elasticsearch to provide near real-time search and analysis.
Publishing and managing data are easy with live dashboards, and the OpenDataSoft interface is designed for viewing on mobile devices.
Amanda Ziadeh is a former reporter/producer for GCN.