XSeq: an indexing infrastructure for tree pattern queries

  • Authors:
  • Xiaofeng Meng; Yu Jiang; Yan Chen; Haixun Wang

  • Affiliations:
  • Renmin University of China, Beijing, China; Renmin University of China, Beijing, China; Renmin University of China, Beijing, China; IBM T. J. Watson Research Center, Hawthorne, NY

  • Venue:
  • SIGMOD '04 Proceedings of the 2004 ACM SIGMOD international conference on Management of data
  • Year:
  • 2004

Abstract

Given a tree-pattern query, most XML indexing approaches decompose it into multiple sub-queries and then join the sub-query results to answer the original query. These join operations have been identified as the most time-consuming component of XML query processing. XSeq is an XML indexing infrastructure that makes tree patterns first-class citizens in XML query processing. Unlike most indexing methods, which manipulate tree structures directly, XSeq builds its indexing infrastructure on a much simpler data model: sequences. That is, we represent both XML data and XML queries as structure-encoded sequences. We have shown that this representation preserves query equivalence and, more importantly, that structured queries can be answered directly through subsequence matching, without resorting to expensive join operations. Moreover, the XSeq infrastructure unifies indices on both the content and the structure of XML documents, and hence achieves an additional performance advantage over methods that index only content or only structure, or that index the two separately.
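
As a rough illustration of the idea of answering a tree-pattern query by subsequence matching, the following Python sketch encodes a small tree as a preorder sequence of (label, ancestor-path) pairs and checks whether the encoded query is a subsequence of the encoded document. The encoding shown, the tuple-based tree representation, and the names `encode` and `is_subsequence` are assumptions made for this example; the abstract does not specify XSeq's actual encoding or index structures. The sketch also ignores wildcards, content values, and sibling reordering, and a plain subsequence test like this one can over-report matches (for instance, when two sibling branches carry the same label), so it should not be read as preserving query equivalence.

```python
from typing import List, Tuple

# Hypothetical tree representation for this sketch: (label, [children]).
Tree = Tuple[str, list]
# One sequence item: a node label plus the labels of its ancestors, root first.
Item = Tuple[str, Tuple[str, ...]]


def encode(tree: Tree, prefix: Tuple[str, ...] = ()) -> List[Item]:
    """Encode a tree as a preorder list of (label, ancestor-path) pairs."""
    label, children = tree
    seq: List[Item] = [(label, prefix)]
    for child in children:
        seq.extend(encode(child, prefix + (label,)))
    return seq


def is_subsequence(query_seq: List[Item], data_seq: List[Item]) -> bool:
    """Return True if query_seq occurs in data_seq in order (gaps allowed)."""
    it = iter(data_seq)
    # `item in it` advances the iterator until it finds item (or is exhausted),
    # so successive query items must be found at increasing positions.
    return all(item in it for item in query_seq)


# A small document and two tree-pattern queries, all as nested tuples.
doc = ("book", [("title", []), ("author", [("name", [])]), ("year", [])])
query_hit = ("book", [("author", [("name", [])])])   # book/author/name
query_miss = ("book", [("name", [])])                # book/name (no such path)

doc_seq = encode(doc)
hit = is_subsequence(encode(query_hit), doc_seq)
miss = is_subsequence(encode(query_miss), doc_seq)
```

Here the first query is found with a single in-order scan of the document sequence, with no intermediate results to join, while the second is rejected because no `name` node has `book` as its only ancestor. A real index would replace the linear scan with an index lookup over the sequence items.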