Automaton meets algebra: a hybrid paradigm for XML stream processing

  • Authors: Hong Su; Elke A. Rundensteiner; Murali Mani

  • Affiliations: CS Department, Worcester Polytechnic Institute, Worcester, MA (all authors)

  • Venue: Data & Knowledge Engineering - Special issue: ER 2003
  • Year: 2006

Abstract

XML stream applications pose the challenge of efficiently processing queries over token-based data streams that can only be accessed sequentially. The automata paradigm is naturally suited to pattern recognition on tokenized XML streams, but it requires patches to provide the filtering and restructuring functionality of the XML query language. The algebraic paradigm, in contrast, is a well-established technique for processing self-contained tuples, but it does not traditionally support token inputs. The Raindrop framework is the first to accommodate both paradigms within one algebraic framework, taking advantage of each. This paper describes the overall framework, highlighting three aspects in particular. First, we describe how tokens and tuples are modeled in one uniform query processing model. Second, we present the query rewriting that switches computations between the two data models. Third, we discuss strategies for implementing and synchronizing the operators within the framework. We report experimental results that illustrate the unique optimization opportunities offered by this novel framework.
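
As a rough illustration of the division of labor the abstract describes, the sketch below pairs a stack-based path matcher (standing in for the automaton side) with tuple-level select and project operators (standing in for the algebra side). It is not Raindrop's implementation: the token encoding, the function names `extract_tuples`, `select`, `project`, and `run_demo`, and the sample bibliography data are all assumptions made for this example.

```python
from typing import Callable, Dict, Iterable, Iterator, List, Tuple

# Hypothetical token encoding: (kind, value), kind in {"start", "end", "text"}.
Token = Tuple[str, str]
Row = Dict[str, str]


def extract_tuples(tokens: Iterable[Token], path: List[str]) -> Iterator[Row]:
    """Automaton side: track the stack of open elements and emit one tuple
    for each element whose root-to-element path equals `path`."""
    stack: List[str] = []
    current = None  # tuple under construction, or None outside a match
    for kind, value in tokens:
        if kind == "start":
            stack.append(value)
            if stack == path:
                current = {}
        elif kind == "text":
            # Record the text of direct children of the matched element.
            if current is not None and len(stack) == len(path) + 1:
                current[stack[-1]] = current.get(stack[-1], "") + value
        elif kind == "end":
            if stack == path and current is not None:
                yield current
                current = None
            stack.pop()


def select(rows: Iterable[Row], predicate: Callable[[Row], bool]) -> Iterator[Row]:
    """Algebra side: keep only the tuples that satisfy the predicate."""
    return (row for row in rows if predicate(row))


def project(rows: Iterable[Row], fields: List[str]) -> Iterator[Row]:
    """Algebra side: keep only the named fields of each tuple."""
    return ({f: row.get(f) for f in fields} for row in rows)


def run_demo() -> List[Row]:
    # Made-up token stream for a two-book bibliography.
    tokens = [
        ("start", "bib"),
        ("start", "book"), ("start", "title"), ("text", "A"), ("end", "title"),
        ("start", "year"), ("text", "1999"), ("end", "year"), ("end", "book"),
        ("start", "book"), ("start", "title"), ("text", "B"), ("end", "title"),
        ("start", "year"), ("text", "2004"), ("end", "year"), ("end", "book"),
        ("end", "bib"),
    ]
    books = extract_tuples(tokens, ["bib", "book"])          # tokens -> tuples
    recent = select(books, lambda r: int(r["year"]) > 2000)  # filter tuples
    return list(project(recent, ["title"]))                  # restructure


if __name__ == "__main__":
    print(run_demo())
```

In this sketch the boundary between the two data models is the call to `extract_tuples`: everything before it sees tokens, everything after it sees tuples. Because the operators are chained generators, each tuple flows through filtering and projection as soon as its closing token arrives, without buffering the whole stream.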