Providing resiliency to load variations in distributed stream processing

  • Authors:
  • Ying Xing;Jeong-Hyon Hwang;Uǧur Çetintemel;Stan Zdonik

  • Affiliations:
  • Department of Computer Science, Brown University;Department of Computer Science, Brown University;Department of Computer Science, Brown University;Department of Computer Science, Brown University

  • Venue:
  • VLDB '06 Proceedings of the 32nd international conference on Very large data bases
  • Year:
  • 2006

Quantified Score

Hi-index 0.00

Visualization

Abstract

Scalability in stream processing systems can be achieved by using a cluster of computing devices. The processing burden can, thus, be distributed among the nodes by partitioning the query graph. The specific operator placement plan can have a huge impact on performance. Previous work has focused on how to move query operators dynamically in reaction to load changes in order to keep the load balanced. Operator movement is too expensive to alleviate short-term bursts; moreover, some systems do not support the ability to move operators dynamically. In this paper, we develop algorithms for selecting an operator placement plan that is resilient to changes in load. In other words, we assume that operators cannot move, therefore, we try to place them in such a way that the resulting system will be able to withstand the largest set of input rate combinations. We call this a resilient placement.This paper first formalizes the problem for operators that exhibit linear load characteristics (e.g., filter, aggregate), and introduces a resilient placement algorithm. We then show how we can extend our algorithm to take advantage of additional workload information (such as known minimum input stream rates). We further show how this approach can be extended to operators that exhibit non-linear load characteristics (e.g., join). Finally, we present prototype- and simulation-based experiments that quantify the benefits of our approach over existing techniques using real network traffic traces.