Database support for processing complex aggregate queries over data streams

  • Authors: Yuanzhen Ji
  • Affiliations: SAP Research Dresden, Dresden
  • Venue: Proceedings of the Joint EDBT/ICDT 2013 Workshops
  • Year: 2013

Abstract

Over the last few years, the increasing demand for processing streaming data with high throughput and low latency has led to the development of specialized stream processing engines (SPEs). Although existing SPEs perform well when evaluating stateless operations and stateful operations with small windows, their performance degrades significantly when calculating exact answers for complex aggregate queries with huge windows. Examples include correlated aggregations and the computation of quantiles and order statistics. Meanwhile, modern database systems have demonstrated the ability to process complex analytical tasks efficiently over very large datasets, using technologies such as vertical storage and vectorized query execution. This suggests that it is feasible to leverage database systems to assist SPEs in processing complex aggregate queries and thereby reduce their evaluation latency. The goal of this thesis is to investigate the potential of combining database systems with SPEs in the context of stream processing so as to improve overall query evaluation performance. To this end, the thesis addresses two major topics: (1) dynamic migration of complex aggregate operations between the SPE and the database in response to varying system load, and (2) efficient evaluation of continuous queries over streaming data that has been migrated to the database.