Fault-tolerant stream processing using a distributed, replicated file system

Authors:
YongChul Kwon;Magdalena Balazinska;Albert Greenberg
Affiliations:
University of Washington, Seattle, WA;University of Washington, Seattle, WA;Microsoft Research, Redmond, WA
Venue:
Proceedings of the VLDB Endowment
Year:
2008

References: 32
Cited by: 8

Scale and performance in a distributed file system

ACM Transactions on Computer Systems (TOCS)
Implementing fault-tolerant services using the state machine approach: a tutorial

ACM Computing Surveys (CSUR)
The O2 system

Communications of the ACM
The ObjectStore database system

Communications of the ACM
ARIES: a transaction recovery method supporting fine-granularity locking and partial rollbacks using write-ahead logging

ACM Transactions on Database Systems (TODS)
Network flows: theory, algorithms, and applications

Network flows: theory, algorithms, and applications
Efficient checkpointing on MIMD architectures

Efficient checkpointing on MIMD architectures
Shoring up persistent applications

SIGMOD '94 Proceedings of the 1994 ACM SIGMOD international conference on Management of data
Application level fault tolerance in heterogeneous networks of workstations

Journal of Parallel and Distributed Computing
Staggered Consistent Checkpointing

IEEE Transactions on Parallel and Distributed Systems
A survey of rollback-recovery protocols in message-passing systems

ACM Computing Surveys (CSUR)
QuickStore: a high performance mapped object store

The VLDB Journal — The International Journal on Very Large Data Bases
Main Memory Database Systems: An Overview

IEEE Transactions on Knowledge and Data Engineering
Incremental Recovery in Main Memory Database Systems

IEEE Transactions on Knowledge and Data Engineering
Low-Latency, Concurrent Checkpointing for Parallel Programs

IEEE Transactions on Parallel and Distributed Systems
Differential Logging: A Commutative and Associative Logging Scheme for Highly Parallel Main Memory Databases

Proceedings of the 17th International Conference on Data Engineering
A Study of Index Structures for Main Memory Database Management Systems

VLDB '86 Proceedings of the 12th International Conference on Very Large Data Bases
Design, Implementation, and Performance of Checkpointing in NetSolve

DSN '00 Proceedings of the 2000 International Conference on Dependable Systems and Networks (formerly FTCS-30 and DCCA-8)
Automated application-level checkpointing of MPI programs

Proceedings of the ninth ACM SIGPLAN symposium on Principles and practice of parallel programming
MigThread: Thread Migration in DSM Systems

ICPPW '02 Proceedings of the 2002 International Conference on Parallel Processing Workshops
Gigascope: a stream database for network applications

Proceedings of the 2003 ACM SIGMOD international conference on Management of data
The Google file system

SOSP '03 Proceedings of the nineteenth ACM symposium on Operating systems principles
Aurora: a new model and architecture for data stream management

The VLDB Journal — The International Journal on Very Large Data Bases
Highly available, fault-tolerant, parallel dataflows

SIGMOD '04 Proceedings of the 2004 ACM SIGMOD international conference on Management of data
High-Availability Algorithms for Distributed Stream Processing

ICDE '05 Proceedings of the 21st International Conference on Data Engineering
Fault-tolerance in the Borealis distributed stream processing system

Proceedings of the 2005 ACM SIGMOD international conference on Management of data
Design, implementation, and evaluation of the linear road benchmark on the stream processing core

Proceedings of the 2006 ACM SIGMOD international conference on Management of data
Libckpt: transparent checkpointing under Unix

TCON'95 Proceedings of the USENIX 1995 Technical Conference Proceedings
Linear hashing: a new tool for file and table addressing

VLDB '80 Proceedings of the sixth international conference on Very Large Data Bases - Volume 6
Dynamo: amazon's highly available key-value store

Proceedings of twenty-first ACM SIGOPS symposium on Operating systems principles
Performance evaluation of the striped checkpointing algorithm on the distributed RAID for cluster computer

ICCS'03 Proceedings of the 2003 international conference on Computational science: PartII
Application-Level checkpointing techniques for parallel programs

ICDCIT'06 Proceedings of the Third international conference on Distributed Computing and Internet Technology

Exploitation of backup nodes for reducing recovery cost in high availability stream processing systems

Proceedings of the Fourteenth International Database Engineering & Applications Symposium
iFlow: an approach for fast and reliable Internet-scale stream processing utilizing detouring and replication

Proceedings of the VLDB Endowment
A latency and fault-tolerance optimizer for online parallel query plans

Proceedings of the 2011 ACM SIGMOD International Conference on Management of data
Fault injection-based assessment of partial fault tolerance in stream processing applications

Proceedings of the 5th ACM international conference on Distributed event-based system
Improving Bandwidth Efficiency for Consistent Multistream Storage

ACM Transactions on Storage (TOS)
Pollux: towards scalable distributed real-time search on microblogs

Proceedings of the 16th International Conference on Extending Database Technology
Rollback-recovery without checkpoints in distributed event processing systems

Proceedings of the 7th ACM international conference on Distributed event-based systems
MillWheel: fault-tolerant stream processing at internet scale

Proceedings of the VLDB Endowment

Abstract

We present SGuard, a new fault-tolerance technique for distributed stream processing engines (SPEs) running in clusters of commodity servers. SGuard is less disruptive to normal stream processing than previous proposals and leaves more resources available for it. Like several previous schemes, SGuard is based on rollback recovery [18]: it periodically checkpoints the state of stream processing nodes and restarts failed nodes from their most recent checkpoints. In contrast to previous proposals, however, SGuard performs checkpoints asynchronously: operators continue processing streams during the checkpoint, which reduces the disruption caused by checkpointing. Additionally, SGuard saves the checkpointed state into a new type of distributed and replicated file system (DFS), such as GFS [22] or HDFS [9], leaving more memory available for normal stream processing. To manage resource contention due to simultaneous checkpoints by different SPE nodes, SGuard adds a scheduler to the DFS. This scheduler coordinates large batches of write requests in a manner that reduces individual checkpoint times while maintaining good overall resource utilization. We demonstrate the effectiveness of the approach through measurements of a prototype implementation in the Borealis [2] open-source SPE, using HDFS [9] as the DFS.
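
The general idea of rollback recovery with asynchronous checkpoints can be illustrated with a minimal Python sketch. It is not SGuard's implementation: the names `CountingOperator`, `checkpoint_async`, and `recover` are hypothetical, the state is copied eagerly under a short lock as a simplification, and a local JSON file stands in for the replicated DFS. The point it shows is only that the operator keeps processing while the checkpoint is being written, and that a restarted node resumes from the last completed checkpoint.

```python
import copy
import json
import os
import tempfile
import threading


class CountingOperator:
    """Toy stateful operator (hypothetical): counts tuples per key."""

    def __init__(self):
        self.counts = {}
        self.lock = threading.Lock()

    def process(self, key):
        with self.lock:
            self.counts[key] = self.counts.get(key, 0) + 1

    def snapshot(self):
        # Short critical section: copy the state, then release the lock so
        # processing can resume while the copy is written out.
        with self.lock:
            return copy.deepcopy(self.counts)

    def restore(self, state):
        with self.lock:
            self.counts = dict(state)


def checkpoint_async(op, path):
    """Take a snapshot now and write it in a background thread."""
    state = op.snapshot()

    def write():
        tmp = path + ".tmp"
        with open(tmp, "w") as f:
            json.dump(state, f)
        # Rename only after the write completes, so a reader never sees
        # a partially written checkpoint (assumption: local file system
        # stands in for the DFS).
        os.replace(tmp, path)

    writer = threading.Thread(target=write)
    writer.start()
    return writer


def recover(path):
    """Restart a failed operator from its most recent checkpoint."""
    op = CountingOperator()
    with open(path) as f:
        op.restore(json.load(f))
    return op


if __name__ == "__main__":
    op = CountingOperator()
    for key in ["a", "b", "a"]:
        op.process(key)
    path = os.path.join(tempfile.mkdtemp(), "ckpt.json")
    writer = checkpoint_async(op, path)
    op.process("a")  # processing continues during the checkpoint write
    writer.join()
    print(recover(path).counts, op.counts)
```

Tuples processed after the snapshot (the final `"a"` above) are not in the checkpoint, so a real system must also be able to replay the input received since the last checkpoint; that part is omitted here.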