XML data fusion

  • Authors: Frantchesco Cecchin; Cristina Dutra De Aguiar Ciferri; Carmem Satie Hara

  • Affiliations: Federal University of Paraná, Curitiba, PR, Brazil; University of São Paulo, São Carlos, SP, Brazil; Federal University of Paraná, Curitiba, PR, Brazil

  • Venue: DaWaK'10 Proceedings of the 12th international conference on Data warehousing and knowledge discovery
  • Year: 2010

Abstract

Ensuring high-quality data when collecting and integrating information from heterogeneous sources into a data warehouse is a challenging problem. In this paper, we propose a model for XML data fusion, which allows the integrator to define data cleaning rules for resolving value conflicts detected during the integration process. These rules resemble the decisions that users make when data are manually curated. Once they are defined, conflicts detected in subsequent integration processes that fall within the context of existing rules can be resolved automatically, without user intervention. We also introduce a notion of fusion policy validation that prevents conflicting resolution rules from being defined. To validate our proposal, we developed XFusion, a rule-based cleaning tool that stores curated data in an integrated repository.
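
The abstract does not specify XFusion's rule language, so the following is only a minimal Python sketch of the general idea, under stated assumptions: flat XML records, one resolution strategy per element name, and two illustrative strategies ("prefer:<source>" and "min"). The names validate_policy, resolve, and fuse, the strategy strings, and the sample records are all hypothetical and are not taken from the paper. The sketch shows rules resolving value conflicts automatically, unresolved conflicts being left for manual curation, and a simple policy check that rejects two different rules for the same element.

```python
import xml.etree.ElementTree as ET


def validate_policy(rules):
    """Build a policy from (element, strategy) pairs.

    Hypothetical validation: reject two different strategies for the
    same element, since they could resolve one conflict in two ways.
    """
    policy = {}
    for element, strategy in rules:
        if element in policy and policy[element] != strategy:
            raise ValueError(f"conflicting rules for element '{element}'")
        policy[element] = strategy
    return policy


def resolve(element, candidates, policy):
    """Pick one value for an element; candidates maps source -> value.

    Returns None when the conflict is outside the scope of every rule,
    meaning it still needs a manual decision.
    """
    unique = set(candidates.values())
    if len(unique) == 1:
        return unique.pop()  # all sources agree, so there is no conflict
    strategy = policy.get(element)
    if strategy is None:
        return None
    if strategy.startswith("prefer:"):
        # Returns None if the preferred source has no value for this element.
        return candidates.get(strategy.split(":", 1)[1])
    if strategy == "min":
        return min(candidates.values(), key=float)
    raise ValueError(f"unknown strategy '{strategy}'")


def fuse(docs, policy):
    """Fuse flat XML records; docs maps source name -> XML string.

    Returns the fused element and the list of elements left unresolved.
    """
    parsed = {src: ET.fromstring(text) for src, text in docs.items()}
    tags = []
    for root in parsed.values():
        for child in root:
            if child.tag not in tags:
                tags.append(child.tag)
    fused = ET.Element(next(iter(parsed.values())).tag)
    pending = []
    for tag in tags:
        candidates = {
            src: root.findtext(tag)
            for src, root in parsed.items()
            if root.find(tag) is not None
        }
        value = resolve(tag, candidates, policy)
        if value is None:
            pending.append(tag)
        else:
            ET.SubElement(fused, tag).text = value
    return fused, pending


# Illustrative usage with made-up records from two made-up sources.
docs = {
    "sourceA": "<book><title>XML Fusion</title><price>30</price><year>2010</year></book>",
    "sourceB": "<book><title>XML data fusion</title><price>25</price><year>2009</year></book>",
}
policy = validate_policy([("title", "prefer:sourceA"), ("price", "min")])
fused, pending = fuse(docs, policy)
print(ET.tostring(fused, encoding="unicode"), pending)
```

In this sketch the title and price conflicts are settled by the rules, while the year conflict has no applicable rule and is reported in pending, which mirrors the abstract's distinction between conflicts resolved automatically and those that still need a user decision.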