Querying Data under Access Limitations

  • Authors:
  • Andrea Cali;Davide Martinenghi

  • Affiliations:
  • Computing Laboratory and Oxford-Man Institute of Quantitative Finance, The University of Oxford, Wolfson Building, Parks Road, Oxford OX1 3QD, United Kingdom/ Faculty of Computer S;Dipartimento di Elettronica e Informazione, Politecnico di Milano, Piazza Leonardo da Vinci 32, 20133 Milano, Italy/ Faculty of Computer Science, Free University of Bozen-Bolzano,

  • Venue:
  • ICDE '08 Proceedings of the 2008 IEEE 24th International Conference on Data Engineering
  • Year:
  • 2008

Quantified Score

Hi-index 0.00

Visualization

Abstract

Data sources on the web are often accessible through web interfaces that present them as relational tables, but require certain attributes to be mandatorily selected, e.g., via a web form. In a scenario where we integrate a set of such sources, and we pose queries over them, the values needed to access a source may have to be retrieved from other sources that are possibly not even mentioned in the query: answering queries at best can then be done only with a potentially recursive query plan that gets all obtainable answers to the query. Since data sources are typically distributed over a network, a major cost indicator for the execution of a query plan is the number of accesses to remote sources. In this paper we present an optimization technique for conjunctive queries that produces a query plan that: (1) minimizes the number of accesses according to a strong notion of minimality; (2) excludes all sources that are not relevant for the query. We introduce Toorjah, a prototype system that answers queries posed on sources with limitations by means of optimized query plans. Toorjah adopts a strategy that is aimed to retrieve answers as early as possible during query processing, and to present them to the user as they are computed. We provide experimental evidence of the effectiveness of our optimization, by showing the reduction of the number of accesses in a large number of cases.