Condensative stream query language for data streams

  • Authors:
  • Lisha Ma;Werner Nutt;Hamish Taylor

  • Affiliations:
  • Heriot-Watt University, Edinburgh, UK;Free University of Bozen-Bolzano, Italy;Heriot-Watt University, Edinburgh, UK

  • Venue:
  • ADC '07 Proceedings of the eighteenth conference on Australasian database - Volume 63
  • Year:
  • 2007

Abstract

In contrast to traditional database queries, a query on stream data is continuous in that it is periodically evaluated over fractions (sliding windows) of the data stream. This introduces challenges beyond those encountered when processing traditional queries. Over a traditional DBMS (Database Management System), the answer to an aggregate query is usually much smaller than the answer to a similar non-aggregate query, which makes query processing condensative. Current proposals for declarative query languages over data streams do not support such condensative processing. Nor is it yet well understood what query constructs and what semantics should be adopted for continuous query languages. In order to make existing stream query languages more expressive, a novel stream query language, CSQL (Condensative Stream Query Language), is proposed over a sequence-based stream model (Ma & Nutt 2005). It is shown that the sequence model supports a precise tuple-based semantics that is lacking in previous time-based models, and thereby provides a formal semantics to understand and reason about continuous queries. CSQL supports the sliding window operators found in previous languages and possesses a declarative semantics that allows one to specify and reason about the different meanings of the frequency by which a query returns answer tuples, which is beyond the reach of previous query languages over streams. In addition, a novel condensative stream algebra is defined by extending an existing stream algebra with a new frequency operator to capture the condensative property. It is shown that the condensative stream algebra enables the generation of efficient continuous query plans and can be used to validate query optimisation. Finally, an experimental study shows that the proposed operators are effective and efficient in practice.
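
To make the idea of a periodically evaluated, condensative query more concrete, the following is a minimal Python sketch of a tuple-based sliding-window aggregate that emits an answer only once every few input tuples. It is not CSQL syntax and not the authors' algebra: the function name `windowed_aggregate` and the parameters `window_size` and `frequency` are hypothetical, chosen only to illustrate how a window and an output frequency can interact.

```python
from collections import deque
from typing import Callable, Iterable, Iterator, List, Tuple


def windowed_aggregate(
    stream: Iterable[float],
    window_size: int,
    frequency: int,
    aggregate: Callable[[List[float]], float],
) -> Iterator[Tuple[int, float]]:
    """Illustrative only: aggregate the last `window_size` tuples,
    emitting one answer after every `frequency` input tuples.

    `window_size` and `frequency` are assumed, tuple-based parameters;
    they are not taken from the CSQL definition.
    """
    if window_size < 1 or frequency < 1:
        raise ValueError("window_size and frequency must be positive")

    # A bounded deque drops the oldest tuple once the window is full.
    window: deque = deque(maxlen=window_size)

    for position, value in enumerate(stream, start=1):
        window.append(value)
        # Emit an answer only at every `frequency`-th tuple, so the
        # output stream is shorter than the input stream.
        if position % frequency == 0:
            yield position, aggregate(list(window))


if __name__ == "__main__":
    readings = [1, 2, 3, 4, 5, 6]
    for position, total in windowed_aggregate(readings, 3, 2, sum):
        print(position, total)
```

With a window of three tuples and an answer every two tuples, the six input readings above yield three answers, which is the sense in which an aggregate query with an output frequency condenses its input.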