Operator scheduling in a data stream manager

  • Authors:
  • Don Carney;Uğur Çetintemel;Alex Rasin;Stan Zdonik;Mitch Cherniack;Mike Stonebraker

  • Affiliations:
  • Department of Computer Science, Brown University;Department of Computer Science, Brown University;Department of Computer Science, Brown University;Department of Computer Science, Brown University;Department of Computer Science, Brandeis University;Laboratory for Computer Science & Department of EECS, M.I.T.

  • Venue:
  • VLDB '03: Proceedings of the 29th International Conference on Very Large Data Bases - Volume 29
  • Year:
  • 2003

Abstract

Many stream-based applications have sophisticated data processing requirements and real-time performance expectations that need to be met under high-volume, time-varying data streams. To address these challenges, we propose novel operator scheduling approaches that specify (1) which operators to schedule, (2) in which order to schedule the operators, and (3) how many tuples to process at each execution step. We study our approaches in the context of the Aurora data stream manager. We argue that a fine-grained scheduling approach in combination with various scheduling techniques (such as batching of operators and tuples) can significantly improve system efficiency by reducing various system overheads. We also discuss application-aware extensions that make scheduling decisions according to per-application Quality of Service (QoS) specifications. Finally, we present prototype-based experimental results that characterize the efficiency and effectiveness of our approaches under various stream workloads and processing scenarios.
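
The three scheduling decisions named in the abstract (which operator, in which order, how many tuples per step) can be illustrated with a toy sketch. This is not Aurora's scheduler or API: the `Operator` class, the `build_pipeline` and `run_schedule` helpers, and the "longest queue first" ordering rule are all assumptions made for illustration. The sketch only shows how tuple batching reduces the number of scheduling steps for the same work.

```python
from collections import deque


class Operator:
    """Toy stream operator with an input queue (hypothetical, not Aurora's API)."""

    def __init__(self, name, fn, emit):
        self.name = name
        self.fn = fn        # maps a tuple to an output, or None to drop it
        self.emit = emit    # callable that receives each output
        self.queue = deque()

    def run(self, max_tuples):
        """Process up to max_tuples queued tuples; return how many were consumed."""
        n = min(max_tuples, len(self.queue))
        for _ in range(n):
            out = self.fn(self.queue.popleft())
            if out is not None:
                self.emit(out)
        return n


def build_pipeline(values):
    """Build a two-operator chain: keep even values, then double them."""
    sink = []
    double = Operator("double", lambda x: x * 2, sink.append)
    keep_even = Operator(
        "keep_even",
        lambda x: x if x % 2 == 0 else None,
        double.queue.append,
    )
    keep_even.queue.extend(values)
    return [keep_even, double], sink


def run_schedule(operators, batch_size):
    """Run until all queues are empty; return the number of scheduling steps.

    Assumed policy: pick the operator with the longest input queue, then
    let it process at most batch_size tuples in that step.
    """
    steps = 0
    while True:
        ready = [op for op in operators if op.queue]
        if not ready:
            return steps
        op = max(ready, key=lambda o: len(o.queue))
        op.run(batch_size)
        steps += 1


for batch_size in (1, 4):
    ops, sink = build_pipeline(range(8))
    steps = run_schedule(ops, batch_size)
    print(f"batch_size={batch_size}: {steps} scheduling steps, output={sink}")
```

In this toy setup both runs produce the same output, but the batched run makes far fewer scheduling decisions. Per-step overhead of that kind is what the abstract refers to when it mentions reducing system overheads through batching.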