Automated error detection using association rules

Authors:
Waqas Ahmed Malik; Antony Unwin
Affiliations:
Department of Computer Oriented Statistics and Data Analysis, University of Augsburg, Augsburg, Germany (both authors)
Venue:
Intelligent Data Analysis
Year:
2011

Abstract

High data quality is important for every application. Inaccurate or inadequate data can lead to inappropriate assumptions, misleading results, bias, and ultimately poor policy and decision making. Finding errors and cleaning data are time-consuming tasks. This paper presents a framework for automatically detecting unusual and erroneous data values in datasets. The main idea is to generate association rules with very high confidence and to identify the cases that are exceptions to these rules. Experimental results show that the proposed framework successfully identifies erroneous values in large datasets.
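
The main idea in the abstract can be illustrated with a minimal sketch. This is not the authors' framework, whose details the abstract does not give. It assumes categorical data held as a list of dictionaries and considers only rules with a single antecedent, of the form (A = a) -> (B = b). The function name `find_rule_exceptions`, the parameters `min_confidence` and `min_support`, and the toy city/country data are all hypothetical and chosen for illustration.

```python
from collections import Counter, defaultdict
from itertools import permutations


def find_rule_exceptions(rows, min_confidence=0.9, min_support=3):
    """Flag rows that violate high-confidence rules (A=a) -> (B=b).

    Illustrative assumption: only single-antecedent rules are mined, and a
    rule is kept when its confidence is at least min_confidence but below
    1.0 (a rule with confidence 1.0 has no exceptions to report).
    """
    columns = list(rows[0].keys())
    exceptions = []
    for lhs, rhs in permutations(columns, 2):
        # For each antecedent value, count the consequent values seen with it.
        counts = defaultdict(Counter)
        for row in rows:
            counts[row[lhs]][row[rhs]] += 1
        for lhs_value, rhs_counts in counts.items():
            total = sum(rhs_counts.values())
            rhs_value, hits = rhs_counts.most_common(1)[0]
            confidence = hits / total
            if hits < min_support or not (min_confidence <= confidence < 1.0):
                continue
            rule = f"{lhs}={lhs_value} -> {rhs}={rhs_value}"
            # Rows matching the antecedent but not the consequent are exceptions.
            for index, row in enumerate(rows):
                if row[lhs] == lhs_value and row[rhs] != rhs_value:
                    exceptions.append((index, rule, confidence))
    return exceptions


# Hypothetical toy data: 20 Augsburg rows, one with a misspelled country.
rows = [{"city": "Augsburg", "country": "Germany"} for _ in range(20)]
rows += [{"city": "Paris", "country": "France"} for _ in range(5)]
rows[7] = {"city": "Augsburg", "country": "Gremany"}

flagged = find_rule_exceptions(rows)
for index, rule, confidence in flagged:
    print(f"row {index} violates {rule} (confidence {confidence:.2f})")
```

In this toy data the only rule with confidence between the threshold and 1.0 is city=Augsburg -> country=Germany, so the misspelled row at index 7 is the only exception reported. A flagged row is a candidate error, not a confirmed one: a legitimate but rare value would be flagged in the same way, so the output is best treated as a list for manual review.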