On label stream partition for efficient holistic twig join

  • Authors: Bo Chen; Tok Wang Ling; M. Tamer Özsu; Zhenzhou Zhu

  • Affiliations: School of Computing, National University of Singapore; School of Computing, National University of Singapore; David R. Cheriton School of Computer Science, University of Waterloo; School of Computing, National University of Singapore

  • Venue: DASFAA'07 Proceedings of the 12th international conference on Database systems for advanced applications
  • Year: 2007

Abstract

Label stream partitioning is a useful technique for reducing the input I/O cost of holistic twig joins by pruning useless streams beforehand. The Prefix Path Stream (PPS) partition scheme is effective for non-recursive XML documents but inefficient for deeply recursive ones: for some twig pattern queries involving recursive tags, pruning and merging the large number of resulting streams incurs a high CPU cost. In this paper, we propose a general stream partition scheme called Recursive Path Stream (RPS), which controls the total number of streams while retaining pruning power. In particular, each recursive path in RPS represents a set of prefix paths that can be recursively expanded from the recursive path. We present algorithms to build the RPS scheme and to prune RPS streams for queries. We also discuss the adaptability of RPS and provide a framework for performance tuning with general RPS based on different application requirements.
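
To make the idea of prefix-path partitioning and pruning concrete, the following is a minimal Python sketch and not the algorithms of the paper. It assumes each element carries a (start, end) region label and a root-to-element tag path, and it handles only a simple path query with ancestor-descendant steps such as //section//title. All function names and the sample document are hypothetical. The `collapse_repeats` helper is only one assumed way to group prefix paths that differ in recursion depth; it is not the paper's definition of a recursive path.

```python
from collections import defaultdict


def partition_by_prefix_path(elements):
    """Group element labels into streams keyed by root-to-element tag path."""
    streams = defaultdict(list)
    for path, label in elements:
        streams[tuple(path)].append(label)
    return dict(streams)


def matches_descendant_query(path, query_tags):
    """Check a path against a query like //t1//t2//...//tn.

    The path must end with tn, and t1..t(n-1) must appear in order
    among the proper ancestors of the last tag.
    """
    if not query_tags or path[-1] != query_tags[-1]:
        return False
    ancestors = iter(path[:-1])
    # 'tag in ancestors' consumes the iterator, giving an in-order subsequence test.
    return all(tag in ancestors for tag in query_tags[:-1])


def prune_streams(streams, query_tags):
    """Keep only the streams whose prefix path can contribute to the query."""
    return {p: s for p, s in streams.items()
            if matches_descendant_query(p, query_tags)}


def collapse_repeats(path):
    """Assumed grouping key: merge a run of the same tag into 'tag+'."""
    out = []
    for tag in path:
        if out and out[-1].rstrip("+") == tag:
            out[-1] = tag + "+"
        else:
            out.append(tag)
    return tuple(out)


# Hypothetical recursive document: sections nested three levels deep.
elements = [
    (("doc",), (1, 18)),
    (("doc", "author"), (2, 3)),
    (("doc", "section"), (4, 17)),
    (("doc", "section", "title"), (5, 6)),
    (("doc", "section", "section"), (7, 16)),
    (("doc", "section", "section", "title"), (8, 9)),
    (("doc", "section", "section", "section"), (10, 15)),
    (("doc", "section", "section", "section", "title"), (11, 12)),
    (("doc", "section", "section", "section", "para"), (13, 14)),
]

streams = partition_by_prefix_path(elements)
useful = prune_streams(streams, ("section", "title"))
grouped_keys = {collapse_repeats(p) for p in streams}

print(len(streams), "prefix-path streams")
print(len(useful), "streams kept for //section//title")
print(len(grouped_keys), "groups after collapsing repeated tags")
```

In this sketch every additional nesting level of `section` adds new prefix-path streams, which is the growth the abstract attributes to deeply recursive documents. Grouping paths that differ only in recursion depth reduces the number of streams to manage, at the price of coarser pruning.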