Rule mining for automatic ontology based data cleaning

  • Authors:
  • Stefan Brüggemann

  • Affiliations:
  • Institute for Information Technology, Oldenburg, Germany

  • Venue:
  • APWeb'08 Proceedings of the 10th Asia-Pacific web conference on Progress in WWW research and development
  • Year:
  • 2008

Abstract

Automatic detection and removal of inconsistencies in data are open challenges in the data quality management cycle. Cleaning invalid data requires specific knowledge, which often means user interaction with domain experts. Domain-specific classes and attributes can be described in ontologies, and attribute value combinations can be labeled as valid or invalid. Our approach to data cleaning allows for the detection and removal of semantic errors in data. Analyzing the replacements made during cleaning enables the creation of rules that minimize the required user interaction. We provide an algorithm that analyzes the frequencies of replacement operations for invalid tuples in the ontology and generates rules, which are then applied automatically in data cleaning environments.
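
The abstract does not give the algorithm itself, so the following is only a minimal sketch of the general idea of frequency-based rule mining over replacement operations, not the paper's method. The function names (`mine_rules`, `clean`), the representation of tuples as plain Python tuples, and the `min_support` and `min_confidence` thresholds are all assumptions made for illustration.

```python
from collections import Counter, defaultdict


def mine_rules(replacements, min_support=3, min_confidence=0.8):
    """Derive replacement rules from a log of past corrections.

    `replacements` is an iterable of (invalid_tuple, replacement_tuple)
    pairs, e.g. corrections made by a domain expert. Both thresholds are
    hypothetical parameters, not values taken from the paper.
    """
    # Count how often each invalid tuple was replaced by each candidate.
    counts = defaultdict(Counter)
    for invalid, replacement in replacements:
        counts[invalid][replacement] += 1

    rules = {}
    for invalid, counter in counts.items():
        total = sum(counter.values())
        replacement, freq = counter.most_common(1)[0]
        # Keep a rule only if the dominant replacement is frequent enough
        # and clearly preferred over the alternatives.
        if freq >= min_support and freq / total >= min_confidence:
            rules[invalid] = replacement
    return rules


def clean(records, rules):
    """Apply mined rules; records without a matching rule are left as-is."""
    return [rules.get(record, record) for record in records]


# Hypothetical replacement log of (city, country) attribute combinations.
log = (
    [(("Oldenburg", "France"), ("Oldenburg", "Germany"))] * 5
    + [(("Oldenburg", "France"), ("Oldenburg", "Denmark"))]
    + [(("Paris", "Germany"), ("Paris", "France"))]
)
rules = mine_rules(log)
cleaned = clean([("Oldenburg", "France"), ("Paris", "Germany")], rules)
```

In this sketch, an invalid tuple that was corrected only once yields no rule, so it would still be handed to a domain expert; only frequent, consistent replacements are automated.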