Towards scalable data integration under constraints

  • Authors: George Konstantinidis; Jose Luis Ambite

  • Affiliations: University of Southern California, Marina Del Rey, CA (both authors)

  • Venue: Proceedings of the 2012 Joint EDBT/ICDT Workshops
  • Year: 2012

Abstract

In this paper we consider the problem of answering queries using views, with or without ontological constraints, which is important for data integration, query optimization, and data warehouses. Our context is data integration, so we search for maximally-contained rewritings. We have produced a very scalable and efficient solution for the simplest form of the problem, conjunctive queries and views, and we are working towards the full relational case. When constraints are considered, the problem is usually divided into two phases: (1) query expansion, which rewrites queries w.r.t. the intensional knowledge, and (2) reformulation of the expanded query using the views. Relevant algorithms have paid little attention to the second phase and have overall studied only a limited form of view definition languages (namely, GAV). By looking at the problem from a graph perspective, we gain better insight and develop designs that compactly represent common patterns in the source descriptions and (optionally) push some computation offline. This allows us to contribute significantly to each of the aforementioned phases individually, tailor each to the other, and moreover address them in a unified way. We intend to provide a solution that supports a variety of ontology languages and all prevalent view definition languages (G/LAV). Towards such a general and scalable system, our preliminary results for the relational case show experimental performance about two orders of magnitude faster than current state-of-the-art algorithms, rewriting queries using over 10000 views within seconds.
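
The two phases named in the abstract can be illustrated with a deliberately tiny sketch. It is not the authors' graph-based algorithm: it assumes only subclass constraints as the ontology, only GAV mappings as the view language, and a naive enumeration of combinations. All predicate names, source names, and function names (`SUBCLASS_OF`, `GAV`, `expand`, `reformulate`) are hypothetical and chosen for illustration.

```python
from itertools import product

# Hypothetical toy ontology: (sub, sup) means every `sub` is also a `sup`.
SUBCLASS_OF = {("Professor", "Faculty"), ("Lecturer", "Faculty")}

# Hypothetical GAV mappings: each global predicate is populated by source relations.
GAV = {
    "Professor": ["src1_prof"],
    "Lecturer": ["src2_lect"],
    "Teaches": ["src1_teaches", "src3_teaches"],
}

def specializations(pred):
    """Return pred plus every predicate that implies it (transitive closure)."""
    found, frontier = {pred}, [pred]
    while frontier:
        current = frontier.pop()
        for sub, sup in SUBCLASS_OF:
            if sup == current and sub not in found:
                found.add(sub)
                frontier.append(sub)
    return found

def expand(query):
    """Phase 1: expand a conjunctive query into a union of conjunctive queries.

    A query is a list of atoms; each atom is (predicate, argument_tuple).
    """
    choices = [
        [(p, args) for p in sorted(specializations(pred))]
        for pred, args in query
    ]
    return [list(combo) for combo in product(*choices)]

def reformulate(expanded):
    """Phase 2 (GAV only): unfold each global atom into source atoms.

    A conjunctive query with an atom that no source covers yields no rewriting.
    """
    rewritings = []
    for cq in expanded:
        choices = [[(src, args) for src in GAV.get(pred, [])] for pred, args in cq]
        rewritings.extend(list(combo) for combo in product(*choices))
    return rewritings

# q(x, c) :- Faculty(x), Teaches(x, c)
query = [("Faculty", ("x",)), ("Teaches", ("x", "c"))]
for rewriting in reformulate(expand(query)):
    print(rewriting)
```

In this toy setting no source populates `Faculty` directly, so answers can only be obtained through the expanded queries over `Professor` and `Lecturer`. The naive enumeration grows multiplicatively with the number of atoms and sources, which hints at why compact representations matter at the scale of thousands of views; the LAV case additionally requires a proper rewriting algorithm rather than simple unfolding.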