A decomposition-based probabilistic framework for estimating the selectivity of XML twig queries

  • Authors:
  • Chao Wang;Srinivasan Parthasarathy;Ruoming Jin

  • Affiliations:
  • Department of Computer Science and Engineering, The Ohio State University;Department of Computer Science and Engineering, The Ohio State University;Department of Computer Science and Engineering, The Ohio State University

  • Venue:
  • EDBT'06 Proceedings of the 10th international conference on Advances in Database Technology
  • Year:
  • 2006

Quantified Score

Hi-index 0.00

Visualization

Abstract

In this paper we present a novel approach for estimating the selectivity of XML twig queries. Such a technique is useful for answering approximate queries as well as for determining an optimal query plan for complex queries based on said estimates. Our approach relies on a summary structure that contains the occurrence statistics of small twigs. We rely on a novel probabilistic approach for decomposing larger twig queries into smaller ones. We then show how it can be used to estimate the selectivity of the larger query in conjunction with the summary information. We present and evaluate different strategies for decomposition and compare this work against a state-of-the-art selectivity estimation approach on synthetic and real datasets. The experimental results show that our proposed approach is very effective in estimating the selectivity of XML twig queries.