Index structures for matching XML twigs using relational query processors

  • Authors:
  • Zhiyuan Chen; Johannes Gehrke; Flip Korn; Nick Koudas; Jayavel Shanmugasundaram; Divesh Srivastava

  • Affiliations:
  • Information Systems Department, University of Maryland Baltimore County, 1000 Hilltop Circle, Baltimore, MD 21250, United States; Department of Computer Sciences, Cornell University, 4105B Upson Hall, Ithaca, NY 14853, United States; AT&T Labs-Research, 180 Park Ave, P.O. Box 971, Florham Park, NJ 07932-0971, United States; Department of Computer Science, Bahen Center for Information Technology, University of Toronto, 40 St. George Street Rm BA5240, Toronto ON M5S 2E4, Canada; Department of Computer Sciences, Cornell University, 4105B Upson Hall, Ithaca, NY 14853, United States; AT&T Labs-Research, 180 Park Ave, P.O. Box 971, Florham Park, NJ 07932-0971, United States

  • Venue:
  • Data & Knowledge Engineering
  • Year:
  • 2007

Abstract

Various index structures have been proposed to speed up the evaluation of XML path expressions. However, existing XML path indices suffer from at least one of three limitations: they index only the structure (relying on a separate index for node content), they are useful only for simple path expressions such as root-to-leaf paths, or they cannot be tightly integrated with a relational query processor. Moreover, there is no unified framework for comparing these index structures. In this paper, we present a framework defining a family of index structures that includes most existing XML path indices. We propose two novel index structures in this family, with different space-time tradeoffs, that are effective for evaluating XML branching path expressions (i.e., twigs) with value conditions. We further show how this family of index structures can be implemented using the access methods of the underlying relational database system. Finally, we present an experimental evaluation of the tradeoff between index space and matching time. The results show that, for twig queries, our novel indices achieve orders-of-magnitude performance improvements, albeit at a higher space cost, over previously proposed XML path indices that can be tightly integrated with a relational query processor.
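
To make the idea of a twig with a value condition concrete, the following is a minimal sketch of matching one such query through an ordinary relational index. It is not one of the index structures proposed in the paper; the table layout (`path_index`), the function names, and the sample document are all assumptions made for illustration. Each element is stored as a row holding its root-to-node path and text value, and a B-tree index on (path, value) lets the relational engine locate the branch that carries the value condition before joining it with the output branch.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Hypothetical sample document, not taken from the paper.
DOC = """<lib>
  <book><title>XML</title><author>Ann</author><author>Bob</author></book>
  <book><title>SQL</title><author>Cy</author></book>
</lib>"""


def build_index(xml_text):
    """Shred the document into one row per element and index (path, value)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE path_index ("
        "path TEXT, value TEXT, node_id INTEGER PRIMARY KEY, parent_id INTEGER)"
    )
    rows = []

    def visit(elem, prefix, parent_id):
        node_id = len(rows) + 1  # pre-order numbering of elements
        path = prefix + "/" + elem.tag
        text = (elem.text or "").strip()
        rows.append((path, text or None, node_id, parent_id))
        for child in elem:
            visit(child, path, node_id)

    visit(ET.fromstring(xml_text), "", None)
    conn.executemany("INSERT INTO path_index VALUES (?, ?, ?, ?)", rows)
    # Ordinary B-tree index provided by the relational engine.
    conn.execute("CREATE INDEX path_value ON path_index(path, value)")
    return conn


def match_twig(conn, cond_path, cond_value, out_path):
    """Match a two-branch twig whose branches share the same parent element.

    For example, /lib/book[title='XML']/author corresponds to
    cond_path='/lib/book/title', cond_value='XML', out_path='/lib/book/author'.
    """
    sql = (
        "SELECT o.value FROM path_index AS c "
        "JOIN path_index AS o ON o.parent_id = c.parent_id "
        "WHERE c.path = ? AND c.value = ? AND o.path = ? "
        "ORDER BY o.node_id"
    )
    return [row[0] for row in conn.execute(sql, (cond_path, cond_value, out_path))]


if __name__ == "__main__":
    conn = build_index(DOC)
    print(match_twig(conn, "/lib/book/title", "XML", "/lib/book/author"))
```

The sketch handles only twigs whose two branches are direct children of the same element; deeper branching would need ancestor identifiers or additional joins, which is the kind of space-time tradeoff the abstract refers to.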