Closed world data exchange

  • Authors:
  • André Hernich; Leonid Libkin; Nicole Schweikardt

  • Affiliations:
  • Humboldt-Universität zu Berlin, Germany; University of Edinburgh, United Kingdom; Goethe-Universität Frankfurt am Main, Germany

  • Venue:
  • ACM Transactions on Database Systems (TODS)
  • Year:
  • 2011

Abstract

Data exchange deals with two tasks: translating data structured in some source format into data structured in some target format, given a specification of the relationship between the source and the target and possibly constraints on the target; and answering queries over the target in a way that is semantically consistent with the information in the source. Theoretical foundations of data exchange have been actively explored recently. It has also been noticed that the standard semantics for query answering in data exchange may lead to counterintuitive or anomalous answers. In the present article, we explain that this behavior has two causes: solutions can contain invented information (information that is not related to the source instance), and the presence of incomplete information in target instances has been ignored. In particular, proper query evaluation techniques for databases with nulls have not been used, and the distinction between closed and open world semantics has not been made. We present a concept of solutions, called CWA-solutions, that is based on the closed world assumption. For data exchange settings without constraints on the target, the space of CWA-solutions has two extreme points: the canonical universal solution (the maximal CWA-solution) and the core of the universal solutions (the minimal CWA-solution), both of them well studied in data exchange. In the presence of constraints on the target, the core of the universal solutions is still the minimal CWA-solution, but there may be no unique maximal CWA-solution. We show how to define the semantics of query answering so that it takes incomplete information into account, and show that some of the well-known anomalies go away with the new semantics. The article also contains results on the complexity of query answering, upper approximations to queries (maybe-answers), and various extensions.
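
The two extreme points mentioned in the abstract can be illustrated on a toy setting without target constraints. The sketch below is not taken from the article: the relation names (Emp, Mgr, Works), the encoding of nulls as strings starting with "_N", the restriction to mapping rules with a single source atom, and the brute-force core computation are all assumptions made for illustration, and the core search is exponential, so it only suits tiny instances.

```python
from itertools import product

def is_null(value):
    # Assumption: labeled nulls are encoded as strings starting with "_N".
    return isinstance(value, str) and value.startswith("_N")

def canonical_solution(source, tgds):
    """Fire every rule once per matching source fact (a naive chase).

    Each rule is (body_relation, body_vars, head_atoms); head variables
    that do not occur in body_vars are existential and get a fresh null.
    Only single-atom bodies with distinct variables are supported here.
    """
    target, fresh = set(), 0
    for body_rel, body_vars, head_atoms in tgds:
        for rel, values in sorted(source):
            if rel != body_rel:
                continue
            env = dict(zip(body_vars, values))
            for head_rel, head_vars in head_atoms:
                for var in head_vars:
                    if var not in env:
                        fresh += 1
                        env[var] = f"_N{fresh}"
                target.add((head_rel, tuple(env[v] for v in head_vars)))
    return target

def core(instance):
    """Shrink an instance until no homomorphism maps it into a proper subset.

    Homomorphisms fix constants and may send nulls to any value that
    occurs in the instance; all candidate mappings are tried.
    """
    instance = set(instance)
    while True:
        nulls = sorted({v for _, t in instance for v in t if is_null(v)})
        adom = sorted({v for _, t in instance for v in t})
        for choice in product(adom, repeat=len(nulls)):
            h = dict(zip(nulls, choice))
            image = {(r, tuple(h.get(v, v) for v in t)) for r, t in instance}
            if image < instance:  # proper subset: a smaller equivalent instance
                instance = image
                break
        else:
            return instance

# Hypothetical mapping: Emp(x) -> exists y. Works(x, y);  Mgr(x, d) -> Works(x, d)
tgds = [
    ("Emp", ("x",), [("Works", ("x", "y"))]),
    ("Mgr", ("x", "d"), [("Works", ("x", "d"))]),
]
source = {("Emp", ("alice",)), ("Emp", ("bob",)), ("Mgr", ("alice", "sales"))}

canonical = canonical_solution(source, tgds)  # maximal extreme in this toy setting
minimal = core(canonical)                     # minimal extreme in this toy setting
```

In this example the canonical universal solution keeps a null for alice's department even though the source already names it, while the core drops that redundant fact and retains the null only for bob, whose department is genuinely unknown.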