Clustered Chain Path Index for XML Document: Efficiently Processing Branch Queries

  • Authors:
  • Hongqiang Wang;Jianzhong Li;Hongzhi Wang

  • Affiliations:
  • School of Computer Science and Technology, Harbin Institute of Technology, Harbin, China 150001;School of Computer Science and Technology, Harbin Institute of Technology, Harbin, China 150001;School of Computer Science and Technology, Harbin Institute of Technology, Harbin, China 150001

  • Venue:
  • World Wide Web
  • Year:
  • 2008

Abstract

Branch query processing is a core operation in XML query processing. In recent years, a number of stack-based twig join algorithms have been proposed to process twig queries over a tag stream index. However, a tag stream index labels each element separately, without considering the similarity among elements, and algorithms built on it perform inefficiently on large documents. This paper proposes a novel index, the Clustered Chain Path Index (CCPI), based on a novel labeling scheme. The index provides efficient support for processing branch queries. For tree-structured XML documents, it also has the same cardinality as the 1-index. Based on CCPI, two algorithms, KMP-Match-Path and Related-Path-Segment-Join, are proposed to process queries. Analysis and experimental results show that the proposed CCPI-based query processing algorithms outperform other algorithms and scale well.
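The abstract names KMP-Match-Path but does not describe it. As a rough illustration only, the sketch below applies the classic Knuth-Morris-Pratt technique to sequences of tag names: it finds where a query path of consecutive parent-child steps occurs inside a root-to-leaf tag path. The function names `failure_table` and `match_path`, the list-of-tags representation, and the sample tags are assumptions made for this example; none of it is taken from the paper or reflects the CCPI labeling scheme.

```python
from typing import List


def failure_table(pattern: List[str]) -> List[int]:
    """Standard KMP failure table: fail[i] is the length of the longest
    proper prefix of pattern[:i + 1] that is also a suffix of it."""
    fail = [0] * len(pattern)
    k = 0
    for i in range(1, len(pattern)):
        while k > 0 and pattern[i] != pattern[k]:
            k = fail[k - 1]
        if pattern[i] == pattern[k]:
            k += 1
        fail[i] = k
    return fail


def match_path(path: List[str], query: List[str]) -> List[int]:
    """Return every offset in `path` at which `query` occurs as a
    contiguous run of tags (hypothetical helper, not the paper's algorithm)."""
    if not query:
        return []
    fail = failure_table(query)
    hits: List[int] = []
    k = 0  # number of query tags matched so far
    for i, tag in enumerate(path):
        while k > 0 and tag != query[k]:
            k = fail[k - 1]  # fall back without re-reading earlier tags
        if tag == query[k]:
            k += 1
        if k == len(query):
            hits.append(i - k + 1)
            k = fail[k - 1]  # keep going to find overlapping matches
    return hits


# Assumed example: a root-to-leaf tag path and the query path item/name.
doc_path = ["site", "regions", "item", "name"]
print(match_path(doc_path, ["item", "name"]))
```

Because each tag of the path is read once and the fallback uses the precomputed table, the scan runs in time linear in the combined length of the path and the query, which is the property that makes KMP-style matching attractive for long label paths.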