Data modeling for the masses

Site offers datasets for anyone who wants them

The Internet is a vast sea of data, and one of the major information streams contributing to it comes from the U.S. government. Much of this material is in the form of datasets, such as regional economic figures, weather charts and geological survey data.

Last year, the Obama administration launched the Data.gov site as a part of its Open Government Initiative. The goal of the site is to make it much easier for the public to access, download and use federal government datasets. Data.gov provides descriptions of government datasets, information about how to access the information and the tools to make the best use of the datasets.

To meet its goal of promoting public participation and collaboration in government, the site provides downloadable federal datasets that the public can use to build applications, conduct analysis and perform research. The site features three catalogs of downloadable datasets, organized under raw data, tools and geodata. Users can scroll through the catalog lists and narrow their searches by subject, federal agency or both. Searches can also be conducted by keyword. There is also a tool for ranking the usefulness of a given metadata set.
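For readers who want a concrete picture of that kind of catalog search, here is a minimal sketch in Python. It does not call Data.gov or reflect how the site is actually built. The `CatalogEntry` record, the `search` function and the three sample entries are all hypothetical, made up for illustration, and the sketch assumes a keyword is matched against the dataset title only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CatalogEntry:
    """Hypothetical catalog record; not Data.gov's real schema."""
    title: str
    agency: str
    subject: str
    catalog: str  # one of "raw data", "tools" or "geodata"


def search(entries, keyword=None, agency=None, subject=None):
    """Return the entries that match every filter that was supplied.

    Filters left as None are ignored, so calling search(entries)
    with no filters returns the whole list.
    """
    results = []
    for entry in entries:
        # Keyword match is case-insensitive and looks at the title only.
        if keyword and keyword.lower() not in entry.title.lower():
            continue
        if agency and entry.agency != agency:
            continue
        if subject and entry.subject != subject:
            continue
        results.append(entry)
    return results


# Illustrative sample records only; titles and subjects are placeholders.
SAMPLE = [
    CatalogEntry("National crime statistics 2007", "FBI", "crime", "raw data"),
    CatalogEntry("Global Visualization Viewer", "USGS", "geography", "tools"),
    CatalogEntry("Regional economic figures", "Example Agency", "economy", "raw data"),
]

if __name__ == "__main__":
    for hit in search(SAMPLE, keyword="crime"):
        print(hit.title, "|", hit.agency, "|", hit.catalog)
```

Combining filters narrows the result, so `search(SAMPLE, keyword="crime", agency="USGS")` returns an empty list in this sample because no single entry satisfies both conditions.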

For an example of what a completed dataset looks like, Data.gov also features a page displaying 51 complete datasets and tools, including a U.S. Geological Survey Global Visualization Viewer for Aerial and Satellite Data and FBI-compiled national crime statistics for 2007.

The site’s newest tool is the GEO Viewer, an interactive mapping application designed to allow users to preview geospatial data available through Data.gov’s catalogs. The tool lets users view datasets on an interactive map, overlay datasets with other datasets and explore the underlying information.

Data.gov is also actively involved in the International Open Government Data Conference, which is wrapping up today in Washington, D.C. As the New York Times reported, at the heart of Data.gov’s efforts is a team of data curators at Rensselaer Polytechnic Institute. Led by James Hendler, the team is responsible for the datasets featured on the website. Hendler, who is the Tetherless World Professor of Computer and Cognitive Science and the Assistant Dean for Information Technology and Web Science at RPI, is one of the conference speakers.

Hendler told the Times that one of the goals behind Data.gov’s efforts is to make the design and building of interactive data sites as easy as setting up a website. He noted that while the capability is not quite there yet, it is only a few years away.
