Automated error detection using association rules

  • Authors: Waqas Ahmed Malik; Antony Unwin

  • Affiliations: Department of Computer Oriented Statistics and Data Analysis, University of Augsburg, Augsburg, Germany (both authors)

  • Venue: Intelligent Data Analysis
  • Year: 2011

Abstract

High data quality is important for every application. Inaccurate or inadequate data can lead to inappropriate assumptions, misleading results, bias, and ultimately poor policy and decision-making. Finding errors and cleaning data is a time-consuming process. This paper presents a framework for automatically detecting unusual and erroneous data values in datasets. The main idea is to generate association rules with very high confidence and to identify the cases that are exceptions to these rules. Experimental results show that the proposed framework is able to successfully identify erroneous values in large datasets.
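
The main idea in the abstract (rules with very high confidence, then the cases that break them) can be illustrated with a minimal sketch. This is not the authors' framework, whose details the abstract does not give. It assumes categorical data, rules with a single antecedent and a single consequent, and hypothetical names and thresholds (`find_rule_exceptions`, `min_confidence`, `min_support`) chosen only for illustration.

```python
from collections import Counter
from itertools import permutations


def find_rule_exceptions(rows, min_confidence=0.9, min_support=5):
    """Flag rows that violate high-confidence rules of the form A=a -> B=b.

    rows: list of dicts mapping column name -> categorical value.
    Returns a list of (row_index, rule) pairs, where rule is
    ((lhs_col, lhs_val), (rhs_col, rhs_val), confidence).
    Illustrative simplification, not the method from the paper.
    """
    if not rows:
        return []
    columns = list(rows[0].keys())
    flagged = []
    for lhs_col, rhs_col in permutations(columns, 2):
        lhs_counts = Counter(r[lhs_col] for r in rows)
        pair_counts = Counter((r[lhs_col], r[rhs_col]) for r in rows)
        for (lhs_val, rhs_val), n_both in pair_counts.items():
            # confidence(A=a -> B=b) = count(A=a and B=b) / count(A=a)
            confidence = n_both / lhs_counts[lhs_val]
            # Rules with confidence exactly 1.0 have no exceptions to report.
            if n_both >= min_support and min_confidence <= confidence < 1.0:
                rule = ((lhs_col, lhs_val), (rhs_col, rhs_val), confidence)
                for i, r in enumerate(rows):
                    if r[lhs_col] == lhs_val and r[rhs_col] != rhs_val:
                        flagged.append((i, rule))
    return flagged


# Made-up toy data: one record pairs Augsburg with the wrong country.
rows = (
    [{"city": "Augsburg", "country": "Germany"}] * 19
    + [{"city": "Augsburg", "country": "France"}]
    + [{"city": "Paris", "country": "France"}] * 10
)
flagged = find_rule_exceptions(rows)
for index, (lhs, rhs, confidence) in flagged:
    print(index, lhs, "->", rhs, round(confidence, 3))
```

In this toy data the only flagged record is the one at index 19, which is an exception to both `city=Augsburg -> country=Germany` and `country=France -> city=Paris`. A flagged record is only a candidate error: a rare but legitimate combination would be reported in the same way, so the output is best read as a list of values to review.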