FIX: feature-based indexing technique for XML documents

  • Authors:
  • Ning Zhang;M. Tamer Özsu;Ihab F. Ilyas;Ashraf Aboulnaga

  • Affiliations:
  • University of Waterloo;University of Waterloo;University of Waterloo;University of Waterloo

  • Venue:
  • VLDB '06 Proceedings of the 32nd international conference on Very large data bases
  • Year:
  • 2006

Quantified Score

Hi-index 0.00

Visualization

Abstract

Indexing large XML databases is crucial for efficient evaluation of XML twig queries. In this paper, we propose a feature-based indexing technique, called FIX, based on spectral graph theory. The basic idea is that for each twig pattern in a collection of XML documents, we calculate a vector of features based on its structural properties. These features are used as keys for the patterns and stored in a B+tree. Given an XPath query, its feature vector is first calculated and looked up in the index. Then a further refinement phase is performed to fetch the final results. We experimentally study the indexing technique over both synthetic and real data sets. Our experiments show that FIX provides great pruning power and could gain an order of magnitude performance improvement for many XPath queries over existing evaluation techniques.