IG: Coding error is key to delays on tax database
- By Mary Mosquera
- Jul 31, 2003
The latest delay in rolling out a modernized IRS database can be traced back to a contractor mistake made two years ago, a Treasury Department auditor says.
Last month the tax agency pushed back deployment of the Customer Account Data Engine (CADE) to the 2004 tax filing season, around March or April. The move follows several earlier delays since 2001.
CADE is designed to replace the 40-year-old magnetic tape Master File with a relational database to hold the records of more than 200 million taxpayers. The IRS will be able to update CADE in real time rather than have to submit files for batch processing, as it does for the Master File.
The lead contractor for the IRS Business Systems Modernization, Computer Sciences Corp., made costly errors in developing CADE, said Margaret Begg, acting assistant inspector general for audit and information system programs with the Treasury Inspector General for Tax Administration.
CSC underestimated the complexity of the process and may still be doing so, Begg said.
CSC two years ago failed to include 30,000 lines of code that would let the Master File program work with CADE, which led to the first major delay in 2001, she said. 'The reconciliation part of the balancing and control program is probably the root cause of the CADE delays,' Begg said, referring to the process in the Master File program in which records are posted and checked for accuracy.
The reconciliation phase of making sure taxpayer information is accurately posted has to be done for the Master File as it exists and then carried over to CADE, Begg said. CSC recognized the problem, she said, but 'getting it fixed is a much more tedious effort than originally realized.'
CADE's accuracy depends upon the Master File data being accurate. The two systems will run in tandem as CADE expands its operations through five releases. The first release, to bring over the records of about 6 million Form 1040EZ filers, had been set for this month.
The IRS said it pushed the rollout back to next year after deciding the August date was too close to the period when the agency would be making tax code changes for the 2004 filing season.
Software setbacks
The setbacks prompted IRS Commissioner Mark Everson to launch an independent review of the program to evaluate its progress and determine whether changes are needed. 'These delays are particularly disturbing, especially since the General Accounting Office continues to view modernization as a high-risk area,' he said in a statement.
Following the setbacks in 2001, the IRS renegotiated its cost-plus contract with the Prime alliance, the CSC-led consortium of contractors working on IRS modernization, converting it to a firm fixed-price contract.
Although CSC fixed the 30,000 lines of software code, problems have lingered. 'The IRS has no assurance that this problem has been accurately corrected,' Begg said.
CSC tried to move ahead with other aspects of CADE so it could begin testing and release the pilot. As it did so, glitches that remained in the balancing and control components were 'magnified over time,' she said.
A CSC spokesman said the company was confident that CADE's first phase will be in place during the 2004 tax season but declined further comment.
At Everson's request, the Software Engineering Institute at Carnegie Mellon University will review the project and report its findings to the IRS in 60 to 90 days. SEI, a federally funded R&D center sponsored by the Defense Department, has performed many technical assessments of systems for agencies, said Brian Gallagher, the institute's acquisition support program director.
The IRS Oversight Board will also assess CADE, reviewing the contractors' performance and the agency's management of the Prime contract over its four-year history.
Former IRS CIO John Reece said the main source of CADE's repeated delays has been the sheer complexity of getting the old and new systems to work together and the amount of testing the IRS demands.
'CADE's first release's functionality accounts for only about 12 percent of the effort. But 88 percent of it is building the foundations for future releases of CADE and the bridges back to the original system,' said Reece, who stepped down as the agency's CIO in April and now runs a technology consulting firm.
Dual-use code
The first release will establish the forms, update instructions, interface and organization of the system. CADE will more quickly provide accurate account information and speed up refunds.
The old and new systems will have to communicate with each other. If CADE, for example, can't handle a specific return for some reason, it will revert to the Master File system. 'We have to get the connectivity and data passing back and forth,' Reece said.
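As an illustration only, the fallback Reece describes can be sketched in a few lines of Python. Every name here (TaxReturn, CADE_SUPPORTED_FORMS, route_return) is hypothetical, and the rule that only Form 1040EZ returns go to the new system is an assumption drawn from the scope of the first release, not the IRS's actual bridge logic.

```python
from dataclasses import dataclass

# Assumption: the first release accepts only Form 1040EZ returns.
CADE_SUPPORTED_FORMS = {"1040EZ"}


@dataclass
class TaxReturn:
    taxpayer_id: str  # hypothetical identifier, for illustration only
    form_type: str


def route_return(tax_return: TaxReturn) -> str:
    """Name the system that should process this return."""
    if tax_return.form_type in CADE_SUPPORTED_FORMS:
        return "CADE"
    # Anything the new system cannot handle reverts to the legacy system.
    return "MASTER_FILE"
```

In this sketch, a 1040EZ return is routed to "CADE" and any other form type falls back to "MASTER_FILE". The real bridges would also have to pass account data in both directions, which the sketch leaves out.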
The source of the project's complexity, he said, is that code has to be written to operate with both systems. The taxpayer history also has to be put in the database.
Based on his time as IRS CIO, Reece said, he is convinced the technology is going to work.
'But we were more optimistic than we should have been. We did not appreciate the number of unknowns that we were going to meet up with,' he said.
CADE must be tested until no defects remain, Reece said. That includes the software code for bridges, the logic embedded in CADE so it knows when to send the data back to the Master File. 'I've never seen a testing process as thorough, complete and rigorous as this one,' he said.
This year, the IRS budgeted $33 million for CADE out of the total $422 million for all modernization efforts. For next year, the agency currently plans to spend $84 million on CADE out of $458 million.
Mary Mosquera is a reporter for Federal Computer Week.