Data quality through knowledge engineering

  • Authors:
  • Tamraparni Dasu;Gregg T. Vesonder;Jon R. Wright

  • Affiliations:
  • AT&T Labs - Research, Florham Park, NJ;AT&T Labs - Research, Florham Park, NJ;AT&T Labs - Research, Florham Park, NJ

  • Venue:
  • Proceedings of the ninth ACM SIGKDD international conference on Knowledge discovery and data mining
  • Year:
  • 2003

Quantified Score

Hi-index 0.00

Visualization

Abstract

Traditionally, data quality programs have acted as a preprocessing stage to make data suitable for a data mining or analysis operation. Recently, data quality concepts have been applied to databases that support business operations such as provisioning and billing. Incorporating business rules that drive operations and their associated data processes is critically important to the success of such projects. However, there are many practical complications. For example, documentation on business rules is often meager. Rules change frequently. Domain knowledge is often fragmented across experts, and those experts do not always agree. Typically, rules have to be gathered from subject matter experts iteratively, and are discovered out of logical or procedural sequence, like a jigsaw puzzle. Our approach is to impement business rules as constraints on data in a classical expert system formalism sometimes called production rules. Our system works by allowing good data to pass through a system of constraints unchecked. Bad data violate constraints and are flagged, and then fed back after correction. Constraints are added incrementally as better understanding of the business rules is gained. We include a real-life case study.