Semantics and evaluation techniques for window aggregates in data streams

  • Authors:
  • Jin Li;David Maier;Kristin Tufte;Vassilis Papadimos;Peter A. Tucker

  • Affiliations:
  • Portland State University, Portland, OR;Portland State University, Portland, OR;Portland State University, Portland, OR;Portland State University, Portland, OR;Whitworth College, Spokane, WA

  • Venue:
  • Proceedings of the 2005 ACM SIGMOD international conference on Management of data
  • Year:
  • 2005

Quantified Score

Hi-index 0.00

Visualization

Abstract

A windowed query operator breaks a data stream into possibly overlapping subsets of data and computes a result over each. Many stream systems can evaluate window aggregate queries. However, current stream systems suffer from a lack of an explicit definition of window semantics. As a result, their implementations unnecessarily confuse window definition with physical stream properties. This confusion complicates the stream system, and even worse, can hurt performance both in terms of memory usage and execution time. To address this problem, we propose a framework for defining window semantics, which can be used to express almost all types of windows of which we are aware, and which is easily extensible to other types of windows that may occur in the future. Based on this definition, we explore a one-pass query evaluation strategy, the Window-ID (WID) approach, for various types of window aggregate queries. WID significantly reduces both required memory space and execution time for a large class of window definitions. In addition, WID can leverage punctuations to gracefully handle disorder. Our experimental study shows that WID has better execution-time performance than existing window aggregate query evaluation options that retain and reprocess tuples, and has better latency-accuracy tradeoffs for disordered input streams compared to using a fixed delay for handling disorder.