Towards an algebraic theory of information integration

  • Authors:
  • Gösta Grahne; Victoria Kiricenko

  • Affiliations:
  • Department of Computer Science, Concordia University, Montreal, Que., Canada H3G 1M8; Department of Computer Science, Concordia University, Montreal, Que., Canada H3G 1M8

  • Venue:
  • Information and Computation
  • Year:
  • 2004

Abstract

Information integration systems provide uniform interfaces to a variety of heterogeneous information sources. The current generation of query answering algorithms in local-as-view (source-centric) information integration systems all produce what has been thought of as "the best obtainable" answer, given that the source-centric approach introduces incomplete information into the virtual global relations. However, this "best obtainable" answer does not include all information that can be extracted from the sources, because it does not allow partial information. Nor does the "best obtainable" answer allow for composition of queries: querying the result of a previous query is not equivalent to the composition of the two queries. In this paper, we provide a foundation for information integration, based on the algebraic theory of incomplete information. Our framework allows us to define the semantics of partial facts and to introduce the notion of the exact answer, that is, the answer that includes partial facts. We show that querying under the exact answer semantics is compositional. We also present two methods for actually computing the exact answer. The first method is tableau-based and is a generalization of the "inverse-rules" approach. The second, much more efficient method is a generalization of the rewriting approach and is based on partial containment mappings introduced in the paper.
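
The loss of compositionality described in the abstract can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the single view V(x) :- R(x, y), the constants "a" and "b", the function names, and the use of "drop every tuple containing a null" as a simplified stand-in for the "best obtainable" answer. It is not the tableau-based or rewriting method of the paper.

```python
from dataclasses import dataclass
from itertools import count


@dataclass(frozen=True)
class Null:
    """A labelled null standing for an unknown value (hypothetical helper)."""
    label: int


def invert_source(v_extension):
    """Assumed view V(x) :- R(x, y): each source fact V(a) yields the
    partial global fact R(a, n) with a fresh null n."""
    fresh = count(1)
    return {(x, Null(next(fresh))) for x in v_extension}


def q1(r):
    """Q1(x, y) :- R(x, y)."""
    return set(r)


def q2(rel):
    """Q2(x) :- rel(x, y)."""
    return {(x,) for (x, _y) in rel}


def drop_partial(rel):
    """Keep only null-free tuples (simplified 'best obtainable' answer)."""
    return {t for t in rel if not any(isinstance(v, Null) for v in t)}


r = invert_source({"a", "b"})

# Answer Q1 without partial facts, then run Q2 on that result.
stepwise_null_free = drop_partial(q2(drop_partial(q1(r))))

# Answer the composed query Q2(Q1(R)) directly, dropping nulls only at the end.
composed_null_free = drop_partial(q2(q1(r)))

# Keep partial facts in the intermediate result, then run Q2 on it.
stepwise_with_partial = drop_partial(q2(q1(r)))

print(stepwise_null_free)
print(sorted(composed_null_free))
print(sorted(stepwise_with_partial))
```

In this toy setting the stepwise null-free answer is empty, while the composed query still returns both constants; keeping the partial facts (a, n) in the intermediate result makes the two agree.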