Querying XML streams

  • Authors:
  • Vanja Josifovski;Marcus Fontoura;Attila Barta

  • Affiliations:
  • IBM Almaden Research Center, USA;IBM Almaden Research Center, USA;Department of Computer Science, University of Toronto, Canada

  • Venue:
  • The VLDB Journal — The International Journal on Very Large Data Bases
  • Year:
  • 2005

Quantified Score

Hi-index 0.00

Visualization

Abstract

Efficient querying of XML streams will be one of the fundamental features of next-generation information systems. In this paper we propose the TurboXPath path processor, which accepts a language equivalent to a subset of the for-let-where constructs of XQuery over a single document. TurboXPath can be extended to provide full XQuery support or used to augment federated database engines for efficient handling of queries over XML data streams produced by external sources. Internally, TurboXPath uses a tree-shaped path expression with multiple outputs to drive the execution. The result of a query execution is a sequence of tuples of XML fragments matching the output nodes. Based on a streamed execution model, TurboXPath scales up to large documents and has limited memory consumption for increased concurrency. Experimental evaluation of a prototype demonstrates performance gains compared to other state-of-the-art path processors.