Holistic Join for Generalized Tree Patterns

  • Authors:
  • Prakash Ramanan

  • Affiliations:
  • Department of Computer Science, Wichita State University, Wichita, KS 67260-0083, USA

  • Venue:
  • Information Systems
  • Year:
  • 2007

Quantified Score

Hi-index 0.01

Visualization

Abstract

We consider the problem of evaluating an XQuery query Q (involving only child and descendant axes) on an XML document D. D is stored on a disk and is read from there, in document order. Chen et al. [From Tree Patterns to Generalized Tree Patterns: on efficient evaluation of XQuery, Proceedings of International Conference on Very Large Data Bases (VLDB), 2003, pp. 237-248] presented an algorithm to convert Q (from a large fragment of XQuery) into a Generalized Tree PatternGTP(Q), and a set J(Q) of value join conditions on its vertices. Evaluating Q on D reduces to finding the matches for GTP(Q) in D. We present an efficient algorithm for finding these matches. Excluding the computation of the value joins J(Q), our algorithm performs two linear passes over the data, and runs in O(d|Q|) memory space, where d denotes the depth of D; runtime and disk I/O are O(|Q@?D|). If separate input streams of document nodes for the individual vertices in GTP(Q) are available, our runtime and disk I/O are linear in the input size; this runtime and disk I/O are trivially optimal.