From searching text to querying XML streams

  • Authors: Dan Suciu
  • Affiliations: University of Washington, Department of Computer Science, 114 Sieg Hall, Box 352350, Seattle, WA
  • Venue: Journal of Discrete Algorithms - SPIRE 2002
  • Year: 2004

Abstract

XML data is queried with a limited form of regular expressions, in a language called XPath. New XML stream processing applications, such as content-based routing or selective dissemination of information, require thousands or millions of XPath expressions to be evaluated simultaneously on the incoming XML stream at a high, sustained rate. In its simplest approximation, the XPath evaluation problem is analogous to the text search problem, in which one or several regular expressions need to be matched against a given text. At a finer level, it is related to the tree pattern matching problem. However, unlike the traditional setting, the number of regular expressions here is much larger, while the "text" is much shorter, since it corresponds to a root-to-node path, whose length is bounded by the depth of the XML stream. In this paper we examine techniques that have been proposed for XML stream processing and describe a few open problems.
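
The text-search analogy in the abstract can be made concrete with a small sketch. It assumes linear XPath expressions limited to child (`/`) and descendant (`//`) steps with name tests or `*`, and it is not the method of any particular system discussed in the paper. Each expression is treated as a small automaton that reads the sequence of element names from the root to the current element, and a stack of state sets follows the open elements, so the "text" being matched is never longer than the current depth. The names `parse_query`, `StreamMatcher`, and `match_stream` are hypothetical and introduced only for illustration.

```python
import xml.sax


def parse_query(xpath):
    """Split a linear XPath such as '/a//b/*' into (axis, name) steps.

    Assumption of this sketch: only '/' and '//' axes, name tests or '*'.
    """
    steps = []
    i = 0
    while i < len(xpath):
        if xpath.startswith("//", i):
            axis, i = "descendant", i + 2
        elif xpath.startswith("/", i):
            axis, i = "child", i + 1
        else:
            raise ValueError("expected '/' or '//' at position %d" % i)
        j = i
        while j < len(xpath) and xpath[j] != "/":
            j += 1
        if j == i:
            raise ValueError("empty step in %r" % xpath)
        steps.append((axis, xpath[i:j]))
        i = j
    if not steps:
        raise ValueError("empty query")
    return steps


class StreamMatcher(xml.sax.ContentHandler):
    """Runs all queries in lockstep over SAX events.

    A state (q, k) means: query q has matched its first k steps along the
    path from the root to the current element.
    """

    def __init__(self, xpaths):
        super().__init__()
        self.xpaths = list(xpaths)
        self.queries = [parse_query(x) for x in self.xpaths]
        # One state set per open element, plus the initial set at the bottom.
        self.stack = [{(q, 0) for q in range(len(self.queries))}]
        self.path = []
        self.matches = []  # (xpath, path of the matching element)

    def startElement(self, name, attrs):
        self.path.append(name)
        next_states = set()
        # sorted() only makes the order of reported matches deterministic.
        for q, k in sorted(self.stack[-1]):
            steps = self.queries[q]
            axis, test = steps[k]
            if axis == "descendant":
                # A '//' step may skip this element and match further down.
                next_states.add((q, k))
            if test == "*" or test == name:
                if k + 1 == len(steps):
                    self.matches.append((self.xpaths[q], "/" + "/".join(self.path)))
                else:
                    next_states.add((q, k + 1))
        self.stack.append(next_states)

    def endElement(self, name):
        # Leaving an element restores the states of its parent.
        self.stack.pop()
        self.path.pop()


def match_stream(xpaths, xml_bytes):
    """Parse xml_bytes once and return every (xpath, element path) match."""
    handler = StreamMatcher(xpaths)
    xml.sax.parseString(xml_bytes, handler)
    return handler.matches


if __name__ == "__main__":
    doc = b"<a><b><c/></b><d><b/></d></a>"
    for xpath, where in match_stream(["/a/b", "//b", "/a//c", "/a/*/b", "/b"], doc):
        print(xpath, "matched", where)
```

The stack never holds more than one state set per open element, which is why the depth of the document, rather than its total size, plays the role of the text length. The sketch makes no attempt to share work across queries, so its per-event cost grows with the number of active states; reducing that cost for very large query sets is the kind of problem the abstract refers to.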