Fault-tolerance in the Borealis distributed stream processing system

Authors:
Magdalena Balazinska;Hari Balakrishnan;Samuel R. Madden;Michael Stonebraker
Affiliations:
University of Washington, Seattle, WA;Massachusetts Institute of Technology, Cambridge, MA;Massachusetts Institute of Technology, Cambridge, MA;Massachusetts Institute of Technology, Cambridge, MA
Venue:
ACM Transactions on Database Systems (TODS)
Year:
2008

Abstract

Over the past few years, Stream Processing Engines (SPEs) have emerged as a new class of software systems, enabling low-latency processing of streams of data arriving at high rates. As SPEs mature and get used in monitoring applications that must run continuously (e.g., in network security monitoring), a significant challenge arises: SPEs must be able to handle the various software and hardware faults that occur, masking them to provide high availability (HA). In this article, we develop, implement, and evaluate DPC (Delay, Process, and Correct), a protocol to handle crash failures of processing nodes and network failures in a distributed SPE. Like previous approaches to HA, DPC uses replication and masks many types of node and network failures. In the presence of network partitions, the designer of any replication system faces a choice between providing availability and providing data consistency across the replicas. In DPC, this choice is made explicit: the user specifies an availability bound (no result should be delayed by more than a specified delay threshold, even under failure, if the corresponding input is available), and DPC attempts to minimize the resulting inconsistency between replicas (not all of which might have seen the input data) while meeting the given delay threshold. Although conceptually simple, the DPC protocol tolerates multiple simultaneous failures as well as any further failures that occur during recovery. This article describes DPC and its implementation in the Borealis SPE. We show that DPC enables a distributed SPE to maintain low-latency processing at all times, while also achieving eventual consistency, where applications eventually receive the complete and correct output streams. Furthermore, we show that, independent of system size and failure location, it is possible to handle failures almost up to the user-specified bound in a manner that meets the required availability without introducing any inconsistency.
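
The trade-off described in the abstract can be illustrated with a toy model. The sketch below is not the DPC protocol or any part of the Borealis implementation. It is a minimal single-operator illustration, assuming a hypothetical operator that sums one value from each of two inputs per window, where the second input may arrive late because of a failure. The names `Result`, `run_window`, `b_delay`, and `delay_threshold` are invented for this example, and replication is left out entirely.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Result:
    window: int        # identifier of the window this result belongs to
    value: int         # output value for the window
    emitted_at: float  # seconds after the window closed
    tentative: bool    # True if computed from incomplete input


def run_window(window: int, a_value: int, b_value: int,
               b_delay: float, delay_threshold: float) -> List[Result]:
    """Toy model of the delay / process / correct steps for one window.

    Input A is assumed to be available immediately; input B arrives
    b_delay seconds late (hypothetical failure model).
    """
    if b_delay <= delay_threshold:
        # Delay: the missing input arrives within the bound, so wait for it
        # and emit a single stable result. No inconsistency is introduced.
        return [Result(window, a_value + b_value, b_delay, tentative=False)]

    # Process: the bound is reached before B arrives, so emit a tentative
    # result computed from the input that is available.
    tentative = Result(window, a_value, delay_threshold, tentative=True)

    # Correct: once B arrives, emit the stable result that supersedes the
    # tentative one, so the output is eventually complete and correct.
    correction = Result(window, a_value + b_value, b_delay, tentative=False)
    return [tentative, correction]


if __name__ == "__main__":
    for r in run_window(0, 10, 5, b_delay=1.0, delay_threshold=3.0):
        print(r)
    for r in run_window(1, 10, 5, b_delay=8.0, delay_threshold=3.0):
        print(r)
```

In this model, a failure shorter than the threshold only adds latency, while a longer failure produces a tentative result at the threshold followed by a later correction.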