Meteor Shower: A Reliable Stream Processing System for Commodity Data Centers

  • Authors:
  • Huayong Wang;Li-Shiuan Peh;Emmanouil Koukoumidis;Shao Tao;Mun Choon Chan

  • Venue:
  • IPDPS '12 Proceedings of the 2012 IEEE 26th International Parallel and Distributed Processing Symposium
  • Year:
  • 2012

Abstract

Large-scale failures are commonplace in commodity data centers, the major platforms for Distributed Stream Processing Systems (DSPSs). Yet most DSPSs can handle only single-node failures. Here, we propose Meteor Shower, a new fault-tolerant DSPS that overcomes large-scale burst failures while improving overall performance. Meteor Shower is based on checkpoints. Unlike previous schemes, Meteor Shower orchestrates operators' checkpointing activities through tokens. The tokens originate from source operators and trickle down the stream graph, triggering each operator that receives them to checkpoint its own state. Meteor Shower comprises a suite of three new techniques: 1) source preservation, 2) parallel, asynchronous checkpointing, and 3) application-aware checkpointing. Source preservation allows Meteor Shower to avoid the overhead of redundant tuple saving incurred by prior schemes; parallel, asynchronous checkpointing enables Meteor Shower operators to continue processing streams during a checkpoint; and application-aware checkpointing lets Meteor Shower learn the changing pattern of operators' state size and initiate checkpoints only when the state size is minimal. Together, the three techniques enable Meteor Shower to improve throughput by 226% and lower latency by 57% versus the prior state of the art. Our results were measured on a prototype implementation running three real-world applications in the Amazon EC2 cloud.
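
The token mechanism described in the abstract can be illustrated with a small sketch. This is not the authors' implementation. The `Operator` class, the `run_checkpoint_round` function, and the four-operator graph are hypothetical names chosen for illustration. The sketch also assumes that an operator with several inputs checkpoints only after a token has arrived on every input, a detail the abstract does not specify.

```python
from collections import deque


class Operator:
    """Hypothetical stream operator holding some state to checkpoint."""

    def __init__(self, name, num_inputs, state=0):
        self.name = name
        self.num_inputs = num_inputs  # 0 for a source operator
        self.state = state
        self.downstream = []
        self.tokens_seen = 0
        self.checkpoints = []  # saved copies of state, one per round

    def connect(self, other):
        self.downstream.append(other)


def run_checkpoint_round(sources):
    """Inject one token at each source and propagate it down the graph.

    Returns the operator names in the order they checkpointed.
    Assumption: an operator waits for a token on all of its inputs.
    """
    order = []
    queue = deque(sources)  # each queue entry is one token delivery
    while queue:
        op = queue.popleft()
        op.tokens_seen += 1
        # Sources have no inputs, so a single injected token suffices.
        if op.tokens_seen < max(op.num_inputs, 1):
            continue
        op.checkpoints.append(op.state)  # checkpoint own state
        op.tokens_seen = 0
        order.append(op.name)
        for nxt in op.downstream:  # forward the token downstream
            queue.append(nxt)
    return order


# Hypothetical graph: two sources feed a join, which feeds a sink.
src1 = Operator("src1", 0)
src2 = Operator("src2", 0)
join = Operator("join", 2, state=7)
sink = Operator("sink", 1)
src1.connect(join)
src2.connect(join)
join.connect(sink)

order = run_checkpoint_round([src1, src2])
print(order)
```

In this sketch the join operator checkpoints only once both upstream tokens have reached it, so every saved state corresponds to the same round of tokens. Whether Meteor Shower aligns tokens this way is an assumption of the example.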