Staying FIT: efficient load shedding techniques for distributed stream processing

  • Authors:
  • Nesime Tatbul;Uǧur Çetintemel;Stan Zdonik

  • Affiliations:
  • ETH Zurich;Brown University;Brown University

  • Venue:
  • VLDB '07 Proceedings of the 33rd international conference on Very large data bases
  • Year:
  • 2007

Quantified Score

Hi-index 0.00

Visualization

Abstract

In distributed stream processing environments, large numbers of continuous queries are distributed onto multiple servers. When one or more of these servers become overloaded due to bursty data arrival, excessive load needs to be shed in order to preserve low latency for the query results. Because of the load dependencies among the servers, load shedding decisions on these servers must be well-coordinated to achieve end-to-end control on the output quality. In this paper, we model the distributed load shedding problem as a linear optimization problem, for which we propose two alternative solution approaches: a solver-based centralized approach, and a distributed approach based on metadata aggregation and propagation, whose centralized implementation is also available. Both of our solutions are based on generating a series of load shedding plans in advance, to be used under certain input load conditions. We have implemented our techniques as part of the Borealis distributed stream processing system. We present experimental results from our prototype implementation showing the performance of these techniques under different input and query workloads.