DATA MANAGEMENT
Standards body issues draft advisory on maintaining open government data
While building out publicly facing data repositories, government
agencies should cleanly separate the user interface layer from the data
being presented, the World Wide Web Consortium (W3C) advises in a draft report on maintaining open government data.
"External parties can create new and exciting interfaces that may
not be obvious to the data publishers. For that reason, do not
compromise the integrity of the data to create flashy interfaces," the
report states. "If you must create an interface, then publish the data
separate from the interface and ensure external parties have direct
access to the raw data, so they can build their own interfaces if they
wish."
The paper, titled "Publishing Open Government Data," is one of the first deliverables from the newly formed W3C eGovernment Interest Group, the mission of which is to help governments around the world share standards-based best practices.
The report was influenced, in part, by feedback gathered at a meeting the group held in March
in Washington to solicit ideas from federal agencies. After sifting
through the feedback it gets on this report and other endeavors, the
group may also eventually issue government-specific standards to help
governments better use the Web, noted group liaison Sandro Hawke in an interview
with GCN.
Responding to the new administration's call for greater transparency, agencies may be eager to post more of their data online.
"The quickest and easiest way to make data available on the Internet
is to publish the data in its raw form," the report states, adding that
the data format used should be machine-readable, for example encoded in
the Extensible Markup Language or the Resource Description Framework, or
laid out in a Comma Separated Values file. By structuring
data in this fashion, third-party computers can better reorganize and
reuse the data. "Formats that only allow the data to be seen, rather
than extracted (for example, pictures of the data), are not useful and
should be avoided."
The group advises, however, that the user interface be kept
separate from the underlying data set. The idea is to make agency Web
sites act like "file servers," the paper states. Tools such as the Extensible Stylesheet Language can render data sets into forms that are easy for humans to scan on a Web page.
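As a rough illustration of that separation, the Python sketch below keeps a small raw data set apart from the code that presents it. The CSV contents, column names and function names are all hypothetical, invented for this example and not taken from the W3C report. One function loads the raw records, one renders them as an HTML table for human readers, and a third shows how an outside party could reuse the same raw data without touching the interface.

```python
import csv
import io
from html import escape

# Hypothetical raw data set, as an agency might publish it in CSV form.
RAW_CSV = """agency,year,grants_awarded
Example Agency A,2008,120
Example Agency B,2008,45
"""


def load_records(raw_text):
    """Parse the raw CSV text into a list of dictionaries (the data layer)."""
    return list(csv.DictReader(io.StringIO(raw_text)))


def render_html_table(records):
    """Render records as an HTML table (one possible interface layer)."""
    if not records:
        return "<table></table>"
    columns = list(records[0].keys())
    header = "".join(f"<th>{escape(c)}</th>" for c in columns)
    rows = "".join(
        "<tr>" + "".join(f"<td>{escape(r[c])}</td>" for c in columns) + "</tr>"
        for r in records
    )
    return f"<table><tr>{header}</tr>{rows}</table>"


def total_grants(records):
    """A third-party reuse of the same raw data, independent of any interface."""
    return sum(int(r["grants_awarded"]) for r in records)


if __name__ == "__main__":
    records = load_records(RAW_CSV)
    print(render_html_table(records))
    print(total_grants(records))
```

Because the table rendering never alters the underlying records, the publisher could replace or drop the HTML view entirely and outside developers would still be able to work from the same CSV file.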
Agencies should take some additional steps to ensure the material
gets into the right hands, the report suggests. Data directories, such
as Data.Gov, should be set up to
allow third parties to peruse the contents. Documents should be
given permanent Uniform Resource Locators or Identifiers, so that they
remain accessible over the years.