Data contamination and row-level error identification

  • Authors: C. Sophie Lee
  • Affiliations: Department of Information Systems, College of Business Administration, California State University, Long Beach, CA
  • Venue: ICAI'09: Proceedings of the 10th WSEAS International Conference on Automation & Information
  • Year: 2009

Abstract

Massive volumes of data are processed by business organizations daily. Errors in an upstream dataspace, or faulty processes, may have already contaminated tens of thousands of downstream data records by the time they are detected and corrected. Many dataspaces possess the Strictly Downstream Dataspace property, under which an overall refresh of the dataspace is infeasible and row-level error identification is required. Such identification is usually done manually, which is time-consuming and error-prone. This paper models the error contamination process and proposes a design to quickly identify the row-level error space in the system.