Indexing XML documents for XPath query processing in external memory

  • Authors:
  • Qun Chen;Andrew Lim;Kian Win Ong;Jiqing Tang

  • Affiliations:
  • Department of Industrial Engineering and Engineering Management, Hong Kong University of Science and Technology, Kowloon, Hong Kong;Department of Industrial Engineering and Engineering Management, Hong Kong University of Science and Technology, Kowloon, Hong Kong;Department of Computer Science and Engineering, University of California at San Diego, La Jolla, CA;Department of Industrial Engineering and Engineering Management, Hong Kong University of Science and Technology, Kowloon, Hong Kong

  • Venue:
  • Data & Knowledge Engineering - Special issue: ER 2003
  • Year:
  • 2006

Abstract

Existing encoding schemes and index structures proposed for XML query processing primarily target containment relationships, specifically the parent-child and ancestor-descendant relationships. XPath, the de facto query language for XML, also defines preceding-sibling and following-sibling location steps, so efficient evaluation of XML queries requires horizontal as well as vertical navigation among the nodes of an XML document. Our work enhances the existing range-based and prefix-based encoding schemes so that all structural relationships between XML nodes can be determined from their codes alone. Furthermore, we introduce the XL+-tree (XML Location+-tree), an external-memory index structure based on the traditional B+-tree, to index element sets so that all location steps defined in the XPath language, vertical and horizontal, top-down and bottom-up, can be processed efficiently. XL+-trees share the same structure under the range and prefix encoding schemes, but some search operations on them differ slightly because the prefix encoding scheme provides richer information. Finally, experiments are conducted to validate the efficiency of the XL+-tree approach. We compare the query performance of the XL+-tree with that of the R-tree, which is capable of handling comprehensive XPath location steps and has been empirically shown to outperform other indexing approaches.
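
The abstract does not spell out the enhanced encodings, so the following is only a minimal sketch of the general idea that structural relationships can be decided from node codes alone. It assumes a plain Dewey-style prefix label (a tuple of child positions) and a range label extended with a hypothetical `parent_start` field; both are illustrative stand-ins rather than the scheme defined in the paper, and all names below are invented for the example.

```python
from typing import NamedTuple, Tuple

# Prefix (Dewey-style) code: the path of 1-based child positions from the root.
# Illustrative assumption, not the paper's enhanced prefix scheme.
Dewey = Tuple[int, ...]

def is_ancestor(a: Dewey, d: Dewey) -> bool:
    """a is a proper ancestor of d if a is a proper prefix of d."""
    return len(a) < len(d) and d[:len(a)] == a

def is_parent(p: Dewey, c: Dewey) -> bool:
    """p is the parent of c if c extends p by exactly one component."""
    return len(c) == len(p) + 1 and c[:-1] == p

def is_preceding_sibling(s: Dewey, n: Dewey) -> bool:
    """s precedes n among the children of the same parent."""
    return len(s) == len(n) > 0 and s[:-1] == n[:-1] and s[-1] < n[-1]

def is_following_sibling(s: Dewey, n: Dewey) -> bool:
    return is_preceding_sibling(n, s)

class RangeCode(NamedTuple):
    """Range code (start, end) plus a hypothetical parent_start field,
    added here so that sibling tests need no tree traversal."""
    start: int
    end: int
    parent_start: int  # 0 for the root (assumption of this sketch)

def range_is_ancestor(a: RangeCode, d: RangeCode) -> bool:
    """a contains d if d's interval lies strictly inside a's interval."""
    return a.start < d.start and d.end < a.end

def range_is_parent(p: RangeCode, c: RangeCode) -> bool:
    return c.parent_start == p.start

def range_is_preceding_sibling(s: RangeCode, n: RangeCode) -> bool:
    """Same parent, and s closes before n opens."""
    return s.parent_start == n.parent_start and s.end < n.start

# Sample document: <a><b><d/></b><c/></a>
a, b, d, c = (1,), (1, 1), (1, 1, 1), (1, 2)
A = RangeCode(1, 8, 0)
B = RangeCode(2, 5, 1)
D = RangeCode(3, 4, 2)
C = RangeCode(6, 7, 1)
```

With these sample labels, `is_preceding_sibling(b, c)` and `range_is_preceding_sibling(B, C)` both hold, while `d` and `c` are not siblings under either encoding. Every check is a constant number of comparisons on the codes themselves (or a prefix comparison bounded by node depth), which is the property that lets an index answer both vertical and horizontal location steps without visiting the document tree.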