AF officer: Warehouse only good data

Avoid compounding systems flaws by starting with well-scrubbed files, he says

By Bao T. Nguyen

Special to GCN

Editor's Note: The opinions expressed by the writer, an Air Force data administrator, are his own and not necessarily those of the service.

Government organizations need to consider an alternative to building data warehouses. A better approach is to build an architecture and solve data problems at the source.

Many organizations expect a data warehouse to provide capabilities that are not available in a legacy environment, but they often cannot use a warehouse without cleaning data, and that raises costs.

Common warehousing problems include systems that misread the data fed into them, lack of standardization, and missing or buried data. An organization that overestimates the quality of its data might have to abandon a warehouse project.
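One way to avoid overestimating data quality is to profile a legacy extract before committing to a warehouse load. The following is a minimal sketch, not a method described by the author: the record layout, the field names and the list of valid state codes are all hypothetical, chosen only to illustrate counting missing and nonstandard values.

```python
from collections import Counter

# Hypothetical records as they might arrive from a legacy extract.
RECORDS = [
    {"id": "001", "state": "CO", "hire_date": "1994-03-01"},
    {"id": "002", "state": "Colo.", "hire_date": ""},
    {"id": "003", "state": "", "hire_date": "1994-03-01"},
]

# Assumed standard for this sketch: two-letter state codes.
VALID_STATES = {"CO", "VA", "DC"}


def profile(records, required, valid_states):
    """Count missing required fields and nonstandard state codes."""
    issues = Counter()
    for rec in records:
        for field in required:
            # Treat absent, empty or whitespace-only values as missing.
            if not rec.get(field, "").strip():
                issues[f"missing:{field}"] += 1
        state = rec.get("state", "").strip()
        if state and state not in valid_states:
            issues["nonstandard:state"] += 1
    return issues


report = profile(RECORDS, ["id", "state", "hire_date"], VALID_STATES)
for issue, count in sorted(report.items()):
    print(issue, count)
```

A report like this, run against each source system, gives a rough measure of how much scrubbing a warehouse project would require before any data is moved.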

Data reflects only organizational details. Information puts data in context and gives it meaning. Data and information alone will not be sufficient in the 21st century. Knowledge will be necessary to synthesize information to improve an organization's understanding of itself, its resources and its mission.

Data warehousing and client-server systems have helped transform data into information. They have integrated advances in database management systems, data access languages and communications facilities. But these technologies by themselves do not produce the right information at the right time, at the right place, in the right form.

Get that data

To consistently transform information into knowledge, an architectural approach is necessary. A data warehouse extracts data from legacy production and online transaction processing (OLTP) systems.

Its data is less than timely because it is constrained by legacy systems.

Interfaced processes remain basically unchanged. Often the benefits are not clearly defined or, when they are, rarely meet expectations or justify the cost.

Maintenance and enhancement costs tend to increase. Warehousing makes the environment more, not less, complex. Interfaces, code, and the cost and time it takes to change just about anything invariably grow.

Until the late 1980s, the information needs of managers had been ignored in favor of meeting the needs of OLTP. Decision support architectures then emerged, made up of data warehouses and analysis tools.

Just as OLTP software is used against operational databases, online analytical processing software is used in a decision support environment against data warehouses. But the warehouse environment differs significantly from the operational environment.

The two should not be mixed, according to Bill Inmon, a warehousing pioneer and chief technology officer of Pine Cone Systems Inc. of Englewood, Colo.

Inmon contends that operational data is different from analytical data. Operational data has different users and fundamentally different processing characteristics.

The time lag of warehoused data from legacy sources is a fatal flaw because the decision-makers expect full integration, not just summaries or histories.

Organizations that build multiple data warehouses wind up multiplying the redundancy, inconsistency and cost. Even warehousing advocates admit a failure rate as high as 60 percent to 80 percent.

What organizations need instead is shared, subject-oriented, stable, nonredundant data.

Data warehouses, though expensive, do have benefits if an organization has a long-term goal. They engage managers' interest in maintaining information systems.

If the data is clean, the managers can make better decisions. If warehousing reveals poor data quality, its visibility wins management support for fixing the problems.

If there is no legacy environment, a warehouse can be set up correctly from scratch, presenting an opportunity to create an enterprise model and identify the data elements.

Common elements

Standards are the key to interconnectivity in most industries, including information processing. Data standards need a data hub, but standards can be applied only when the value of interoperability is obvious and the number of databases is reduced.

There are five steps that government organizations can take to achieve an architecture-based environment:

• Model the functional process, the conceptual data, the conceptual transaction, the conceptual distribution and the conceptual technical model.

• Take control of the existing architecture. Treat information systems as resources, whether liabilities or assets. Log and track them through their lifecycles. Implement a system resource management process.

• Identify enterprise data standards where the value of interoperability exceeds the value of uniqueness.

• Prioritize. Rebuild information systems that create a need for data before rebuilding the systems that use the data.

• Eliminate links between legacy systems, then replace the legacy systems.
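The prioritization step can be read as an ordering problem: rebuild each system only after the systems whose data it consumes. That reading is an interpretation, and the sketch below is illustrative only. The system names and the dependency map are hypothetical, and the ordering uses Python's standard-library graphlib module (Python 3.9 or later).

```python
from graphlib import TopologicalSorter

# Hypothetical inventory: each system maps to the set of systems
# whose data it consumes.
CONSUMES = {
    "personnel": set(),
    "payroll": {"personnel"},
    "reporting": {"payroll", "personnel"},
}


def rebuild_order(consumes):
    """Return systems ordered so data sources come before their consumers."""
    # TopologicalSorter treats each mapped set as predecessors, so
    # static_order() yields a system only after everything it consumes.
    return list(TopologicalSorter(consumes).static_order())


order = rebuild_order(CONSUMES)
print(order)
```

If two systems consume each other's data, graphlib raises a CycleError, which is itself a useful signal that the link between them has to be untangled before either can be rebuilt.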

The challenge of the next decade is to integrate business processes with applications and data. Meeting it will require an enterprise architecture.

Bao T. Nguyen is a senior officer on the Air Force chief information officer's support staff. He has been a data administrator for the Air Force and the Naval Research Laboratory.

