Correspondence and translation for heterogeneous data

  • Authors:
  • Serge Abiteboul; Sophie Cluet; Tova Milo

  • Affiliations:
  • Tel Aviv Univ., Tel Aviv, Israel; Tel Aviv Univ., Tel Aviv, Israel; Tel Aviv Univ., Tel Aviv, Israel

  • Venue:
  • Theoretical Computer Science
  • Year:
  • 2002

Quantified Score

Hi-index 5.23

Abstract

Data integration often requires a clean abstraction of the different formats in which data are stored, together with means for specifying the correspondences (relationships) between data in different worlds and for translating data from one world to another. To this end, we introduce a middleware data model that serves as a basis for the integration task, and a declarative rules language for specifying the integration. We show that, using this language, correspondences between data elements can be computed in polynomial time in many cases, and may require exponential time only when insensitivity to order or duplicates is considered. Furthermore, we show that in most practical cases the correspondence rules can be automatically turned into translation rules that map data from one representation to another. Thus, a complete integration task (derivation of correspondences, transformation of data from one world to the other, incremental integration of a new bulk of data, etc.) can be specified using a single set of declarative rules.