Data exchange: semantics and query answering

  • Authors:
  • Ronald Fagin;Phokion G. Kolaitis;Renée J. Miller;Lucian Popa

  • Affiliations:
  • IBM Almaden Research Center;University of California at Santa Cruz;University of Toronto;IBM Almaden Research Center

  • Venue:
  • Theoretical Computer Science - Database theory
  • Year:
  • 2005

Quantified Score

Hi-index 0.00

Visualization

Abstract

Data exchange is the problem of taking data structured under a source schema and creating an instance of a target schema that reflects the source data as accurately as possible. In this paper, we address foundational and algorithmic issues related to the semantics of data exchange and to the query answering problem in the context of data exchange. These issues arise because, given a source instance, there may be many target instances that satisfy the constraints of the data exchange problem.We give an algebraic specification that selects, among all solutions to the data exchange problem, a special class of solutions that we call universal. We show that a universal solution has no more and no less data than required for data exchange and that it represents the entire space of possible solutions. We then identify fairly general, yet practical, conditions that guarantee the existence of a universal solution and yield algorithms to compute a canonical universal solution efficiently. We adopt the notion of the "certain answers" in indefinite databases for the semantics for query answering in data exchange. We investigate the computational complexity of computing the certain answers in this context and also address other algorithmic issues that arise in data exchange. In particular, we study the problem of computing the certain answers of target queries by simply evaluating them on a canonical universal solution, and we explore the boundary of what queries can and cannot be answered this way, in a data exchange setting.