EPA CIO tackles data quality flaws in indexing project

IT workers and data gatherers at EPA
offices will help write the data accuracy plan, agency CIO Al Pesachowitz said.

The Environmental Protection Agency’s chief information officer has a new charge:
improve the data quality of EPA’s Sector Facility Indexing Project.

“There has been a big controversy about the SFIP’s data quality,” CIO Al
Pesachowitz said. “Either the sources or the states have made mistakes in reporting
or correcting the records. We have been working with both reporting entities to ensure the
SFIP is as clean as possible.”

EPA Administrator Carol M. Browner gave Pesachowitz until Sept. 30 to create a plan to
ensure data accuracy. The indexing project is part of EPA’s Reinventing Environmental
Information Program, through which the agency is modernizing its 13 major systems. The
five-year plan began in January.

Fixing the data gathering may require hardware and software upgrades, Pesachowitz said,
but he could not say yet what form upgrades might take. Information technology workers and
data gatherers at EPA program offices will help write the plan, Pesachowitz said.

“The team will make an assessment of the current data system, prioritize
improvements to the data quality and data corrections process, correct the data in the
current system and consider future improvements,” he said.

The indexing project gathers data on the environmental performance of 653 facilities
in five major industries. EPA makes the information available to the public via the Web.
The information comes from inspection reports and enforcement actions taken against

The indexing project is a compilation of other databases that analysts can use to
monitor industry compliance, EPA compliance director Elaine Stanley said. The information
in the databases comes from inspections by state and federal agencies under the Clean Air
Act of 1970, the Clean Water Act of 1972, the Resource Conservation and Recovery Act of
1976, and the Emergency Planning and Community Right-to-Know Act of 1986.

The index taps databases running at EPA’s National Computer Center in Research
Triangle Park, N.C. EPA houses the databases on an IBM 9021 mainframe that has 10 CPUs and
a processing speed of 470 million instructions per second, said Walter Shackelford, chief
of enterprise technology operations at the center.

Users can download indexing project data through the EPA Web site at http://www.epa.gov/oeca/sfi.

The system’s effectiveness rests on the data’s accuracy, Stanley said.

“The project was controversial from the beginning because it provided public
access to information in one location, which had never been done before,” Stanley
said. “This meant we also had a responsibility to make the SFIP information
