Fast and Highly-Available Stream Processing over Wide Area Networks

  • Authors:
  • Jeong-Hyon Hwang; Ugur Cetintemel; Stan Zdonik

  • Affiliations:
  • Department of Computer Science, Brown University (jhhwang@cs.brown.edu); Department of Computer Science, Brown University (ugur@cs.brown.edu); Department of Computer Science, Brown University (sbz@cs.brown.edu)

  • Venue:
  • ICDE '08: Proceedings of the 2008 IEEE 24th International Conference on Data Engineering
  • Year:
  • 2008

Abstract

We present a replication-based approach that realizes both fast and highly-available stream processing over wide area networks. In our approach, multiple operator replicas send outputs to each downstream replica so that it can use whichever data arrives first. To further expedite the data flow, replicas run independently, possibly processing data in different orders. Despite this complication, our approach always delivers what non-replicated processing would produce without failures. We call this guarantee replication transparency. In this paper, we first discuss semantic issues for replication transparency and extend stream-processing primitives accordingly. Next, we develop an algorithm that manages replicas at geographically dispersed servers. This algorithm strives to achieve the best latency guarantee, relative to the cost of replication. Finally, we substantiate the utility of our work through experiments on PlanetLab servers as well as simulations based on real network traces.
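The "use whichever data arrives first" idea can be illustrated with a minimal sketch. This is not the paper's algorithm: it assumes, purely for illustration, that every output tuple carries a sequence number shared across replicas and that the downstream replica releases tuples in sequence order. The class name `FirstArrivalMerger` and its `receive` method are hypothetical.

```python
from typing import Dict, List, Tuple


class FirstArrivalMerger:
    """Hypothetical downstream-side merger: keeps the first copy of each tuple.

    Assumption (not from the paper): upstream replicas tag equivalent output
    tuples with the same integer sequence number, starting at 0.
    """

    def __init__(self) -> None:
        self._next_seq = 0  # next sequence number to release downstream
        self._pending: Dict[int, Tuple[str, object]] = {}  # seq -> (replica, payload)

    def receive(self, replica: str, seq: int, payload: object) -> List[Tuple[int, str, object]]:
        """Accept one tuple and return the tuples that can now be released in order."""
        # A copy that was already released or is already buffered is a late
        # duplicate from a slower replica, so it is dropped.
        if seq < self._next_seq or seq in self._pending:
            return []
        self._pending[seq] = (replica, payload)

        # Release the longest gap-free run starting at the next expected number.
        released = []
        while self._next_seq in self._pending:
            source, data = self._pending.pop(self._next_seq)
            released.append((self._next_seq, source, data))
            self._next_seq += 1
        return released


merger = FirstArrivalMerger()
first = merger.receive("replica-A", 0, "x")      # A wins tuple 0
duplicate = merger.receive("replica-B", 0, "x")  # B's copy arrives late
early = merger.receive("replica-B", 2, "z")      # buffered until tuple 1 shows up
run = merger.receive("replica-A", 1, "y")        # fills the gap, releases 1 and 2
```

In this sketch the output stream is the same whichever replica happens to be faster for a given tuple, which is the flavor of the replication-transparency guarantee described above; the paper's actual primitives also handle replicas that process data in different orders, which this example does not attempt.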