Rule mining for automatic ontology based data cleaning

Authors:
Stefan Brüggemann
Affiliations:
Institute for Information Technology, Oldenburg, Germany
Venue:
APWeb'08 Proceedings of the 10th Asia-Pacific web conference on Progress in WWW research and development
Year:
2008

Citing 7
Cited 1

Mining association rules between sets of items in large databases

SIGMOD '93 Proceedings of the 1993 ACM SIGMOD international conference on Management of data
A translation approach to portable ontology specifications

Knowledge Acquisition - Special issue: Current issues in knowledge modeling
Dynamic itemset counting and implication rules for market basket data

SIGMOD '97 Proceedings of the 1997 ACM SIGMOD international conference on Management of data
Query processing in the SIMS information mediator

Readings in agents
Mining frequent patterns without candidate generation

SIGMOD '00 Proceedings of the 2000 ACM SIGMOD international conference on Management of data
An Efficient Algorithm for Mining Association Rules in Large Databases

VLDB '95 Proceedings of the 21th International Conference on Very Large Data Bases
Using ontologies for XML data cleaning

OTM'05 Proceedings of the 2005 OTM Confederated international conference on On the Move to Meaningful Internet Systems

Interchangeable consistency constraints for public health care systems

Proceedings of the 2010 ACM Symposium on Applied Computing

Quantified Score

Hi-index	0.00

Visualization

Abstract

Automatic detection and removal of inconsistencies in data are open challenges in the data quality management cycle. Specific knowledge is needed to clean invalid data, which often requires user interaction with domain experts. Domain specific classes and attributes can be described in ontologies. Attribute value combinations can be labeled as valid or invalid. Our approach on data cleaning allows for detection and removal of semantic errors in data. The analysis of replacements enables the creation of rules, which can minimize the required user interaction. We provide an algorithm which analyzes frequencies of replacement operations for invalid tuples in the ontology and generates rules, which are then applied in data cleaning environments automatically.