Adaptive Processing of Top-k Queries in XML

  • Authors:
  • Amelie Marian;Sihem Amer-Yahia;Nick Koudas;Divesh Srivastava

  • Affiliations:
  • Columbia University;AT&T Labs-Research;AT&T Labs-Research;AT&T Labs-Research

  • Venue:
  • ICDE '05 Proceedings of the 21st International Conference on Data Engineering
  • Year:
  • 2005

Quantified Score

Hi-index 0.00

Visualization

Abstract

The ability to compute top-k matches to XML queries is gaining importance due to the increasing number of large XML repositories. The efficiency of top-k query evaluation relies on using scores to prune irrelevant answers as early as possible in the evaluation process. In this context, evaluating the same query plan for all answers might be too rigid because, at any time in the evaluation, answers have gone through the same number and sequence of operations, which limits the speed at which scores grow. Therefore, adaptive query processing that permits different plans for different partial matches and maximizes the best scores is more appropriate. In this paper, we propose an architecture and adaptive algorithms for efficiently computing top-k matches to XML queries. Our techniques can be used to evaluate both exact and approximate matches where approximation is defined by relaxing XPath axes. In order to compute the scores of query answers, we extend the traditional tf*idf measure to account for document structure. We conduct extensive experiments on a variety of benchmark data and queries, and demonstrate the usefulness of the adaptive approach for computing top-k queries in XML.