Optimizing sorting and duplicate elimination in XQuery path expressions

  • Authors:
  • Mary Fernández;Jan Hidders;Philippe Michiels;Jérôme Siméon;Roel Vercammen

  • Affiliations:
  • AT&T Labs Research;University of Antwerp;University of Antwerp;IBM T.J. Watson Research Center;University of Antwerp

  • Venue:
  • DEXA'05 Proceedings of the 16th international conference on Database and Expert Systems Applications
  • Year:
  • 2005

Quantified Score

Hi-index 0.00

Visualization

Abstract

XQuery expressions can manipulate two kinds of order: document order and sequence order. While the user can impose or observe the order of items within a sequence, the results of path expressions must always be returned in document order. Correctness can be obtained by inserting explicit (and expensive) operations to sort and remove duplicates after each XPath step. However, many such operations are redundant. In this paper, we present a systematic approach to remove unnecessary sorting and duplicate elimination operations in path expressions in XQuery 1.0. The technique uses an automaton-based algorithm which we have applied successfully to path expressions within a complete XQuery implementation. Experimental results show that the algorithm detects and eliminates most redundant sorting and duplicate elimination operators and is very effective on common XQuery path expressions.