Data Integration through ${\textit{DL-Lite}_{\mathcal A}}$ Ontologies

  • Authors:
  • Diego Calvanese; Giuseppe De Giacomo; Domenico Lembo; Maurizio Lenzerini; Antonella Poggi; Riccardo Rosati; Marco Ruzzi

  • Affiliations:
  • Faculty of Computer Science, Free University of Bozen-Bolzano; Dip. di Informatica e Sistemistica, SAPIENZA Università di Roma; Dip. di Informatica e Sistemistica, SAPIENZA Università di Roma; Dip. di Informatica e Sistemistica, SAPIENZA Università di Roma; Dip. di Informatica e Sistemistica, SAPIENZA Università di Roma; Dip. di Informatica e Sistemistica, SAPIENZA Università di Roma; Dip. di Informatica e Sistemistica, SAPIENZA Università di Roma

  • Venue:
  • Semantics in Data and Knowledge Bases
  • Year:
  • 2008

Abstract

The goal of data integration is to provide uniform access to a set of heterogeneous data sources, freeing the user from the need to know where the data are, how they are stored, and how they can be accessed. One of the outcomes of the research carried out on data integration in recent years is a clear conceptual architecture, comprising a global schema, the source schema, and the mapping between the source schema and the global schema. In this paper, we present a comprehensive approach to, and a complete system for, ontology-based data integration. In this system, the global schema is expressed as a TBox of the tractable Description Logic ${\textit{DL-Lite}_{\mathcal A}}$, the sources are relations, and the mapping language allows for expressing sound GAV (global-as-view) mappings between the sources and the global schema. The mapping language has specific mechanisms for addressing the so-called impedance mismatch problem, which arises from the fact that, while the data sources store values, the instances of concepts in the ontology are objects. By virtue of the careful design of the various languages used in our system, answering unions of conjunctive queries can be done through a very efficient technique (LogSpace with respect to data complexity) that reduces this task to standard SQL query evaluation. We also show that even very slight extensions of the expressive power of our system lead beyond this complexity bound.
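
The pipeline described in the abstract (a TBox, GAV mappings that build objects from source values, and query answering by reduction to SQL) can be illustrated with a minimal Python sketch. It is a toy under stated assumptions, not the authors' system or algorithm: the relations `emp` and `mgr`, the concepts `Employee` and `Manager`, the function symbol `person`, and the helpers `rewrite`, `unfold`, and `answer` are all hypothetical, and only atomic concept queries with concept inclusions are handled.

```python
import sqlite3

# Hypothetical source database: relations that store plain values.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE emp(ssn TEXT, name TEXT);
    CREATE TABLE mgr(ssn TEXT, dept TEXT);
    INSERT INTO emp VALUES ('111', 'Ada'), ('222', 'Bob');
    INSERT INTO mgr VALUES ('333', 'R&D');
""")

# GAV-style mappings (assumed format): each concept is defined by an SQL
# query over the sources that returns one column named v. The function
# symbol ('person') turns each value into an object term, which is one
# simple way to bridge the value/object impedance mismatch.
MAPPINGS = {
    "Employee": ("person", "SELECT ssn AS v FROM emp"),
    "Manager": ("person", "SELECT ssn AS v FROM mgr"),
}

# Toy TBox with one inclusion, Manager is-a Employee,
# stored as superconcept -> list of direct subconcepts.
SUBCONCEPTS = {"Employee": ["Manager"]}


def rewrite(concept):
    """Return the concept together with all of its transitive subconcepts."""
    result, todo = [], [concept]
    while todo:
        current = todo.pop()
        if current not in result:
            result.append(current)
            todo.extend(SUBCONCEPTS.get(current, []))
    return result


def unfold(concept):
    """Build a single SQL UNION over the sources for the query concept(x)."""
    parts = []
    for c in rewrite(concept):
        functor, sql = MAPPINGS[c]
        parts.append(f"SELECT '{functor}' AS functor, v FROM ({sql})")
    return " UNION ".join(parts)


def answer(concept):
    """Evaluate the unfolded SQL and render each answer as an object term."""
    rows = conn.execute(unfold(concept)).fetchall()
    return sorted(f"{functor}({value})" for functor, value in rows)


print(answer("Employee"))  # ['person(111)', 'person(222)', 'person(333)']
print(answer("Manager"))   # ['person(333)']
```

In this sketch all reasoning happens before the database is touched: the TBox is used only to expand the query, and the mappings only to translate it into SQL, so the data themselves are accessed solely through ordinary SQL evaluation.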