Discretized streams: fault-tolerant streaming computation at scale

Authors:
Matei Zaharia; Tathagata Das; Haoyuan Li; Timothy Hunter; Scott Shenker; Ion Stoica
Affiliations:
University of California, Berkeley (all authors)
Venue:
Proceedings of the Twenty-Fourth ACM Symposium on Operating Systems Principles
Year:
2013

Abstract

Many "big data" applications must act on data in real time. Running these applications at ever-larger scales requires parallel platforms that automatically handle faults and stragglers. Unfortunately, current distributed stream processing models provide fault recovery in an expensive manner, requiring hot replication or long recovery times, and do not handle stragglers. We propose a new processing model, discretized streams (D-Streams), that overcomes these challenges. D-Streams enable a parallel recovery mechanism that improves efficiency over traditional replication and backup schemes, and tolerates stragglers. We show that they support a rich set of operators while attaining high per-node throughput similar to single-node systems, linear scaling to 100 nodes, sub-second latency, and sub-second fault recovery. Finally, D-Streams can easily be composed with batch and interactive query models like MapReduce, enabling rich applications that combine these modes. We implement D-Streams in a system called Spark Streaming.
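
The abstract names the model but does not spell out its mechanics. As a rough illustration only, the sketch below assumes the core idea is what the name suggests: the input stream is cut into short time intervals, and the same deterministic batch computation runs on each interval. The function names, the one-second interval, and the word-count job are all hypothetical choices made for this example; this is not the Spark Streaming API.

```python
from collections import Counter, defaultdict


def discretize(records, interval):
    """Group (timestamp, value) records into batches, one per time interval.

    Hypothetical helper: intervals that received no records are skipped.
    """
    batches = defaultdict(list)
    for timestamp, value in records:
        # Records whose timestamps fall in the same interval share a batch.
        batches[int(timestamp // interval)].append(value)
    # Return the batches in time order.
    return [batches[key] for key in sorted(batches)]


def word_counts_per_batch(records, interval=1.0):
    """Run the same deterministic batch job (a word count) on every interval."""
    return [Counter(batch) for batch in discretize(records, interval)]


# Five timestamped records spanning two one-second intervals.
records = [(0.1, "a"), (0.4, "b"), (0.9, "a"), (1.2, "b"), (1.7, "b")]
result = word_counts_per_batch(records)
```

Because each interval's output in this sketch depends only on that interval's input, a lost result could be recomputed from the input alone, and different intervals could be recomputed on different machines. That is one plausible reading of the parallel recovery the abstract mentions, not a description of the authors' implementation.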