A failure detector is a fundamental abstraction in distributed computing. This article surveys the abstraction along two dimensions. First, we study failure detectors as building blocks that simplify the design of reliable distributed algorithms. In particular, we illustrate how failure detectors can factor the timing assumptions needed to detect failures out of distributed agreement algorithms. Second, we study failure detectors as computability benchmarks. That is, we survey the weakest failure detector question and illustrate how failure detectors can be used to classify problems. We also highlight some limitations of the failure detector abstraction along each of these dimensions.
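To make the first dimension concrete, the following is a minimal sketch, not taken from the article, of how a failure detector can keep timing assumptions out of the algorithm that uses it. It assumes a heartbeat scheme with a timeout that doubles after each false suspicion, a common textbook construction. The names `HeartbeatFailureDetector`, `on_heartbeat`, `check`, and `choose_leader` are hypothetical, and time is passed in explicitly so the example stays deterministic.

```python
class HeartbeatFailureDetector:
    """Timeout-based suspicion list; all timing logic lives in this class."""

    def __init__(self, processes, initial_timeout=1.0):
        # Per-process timeout and time of the last heartbeat (hypothetical layout).
        self.timeout = {p: initial_timeout for p in processes}
        self.last_heartbeat = {p: 0.0 for p in processes}
        self.suspected = set()

    def on_heartbeat(self, p, now):
        """Record a heartbeat from p received at time `now`."""
        self.last_heartbeat[p] = now
        if p in self.suspected:
            # False suspicion: trust p again and wait longer next time.
            self.suspected.discard(p)
            self.timeout[p] *= 2

    def check(self, now):
        """Suspect every process whose heartbeat is overdue; return the suspects."""
        for p, last in self.last_heartbeat.items():
            if now - last > self.timeout[p]:
                self.suspected.add(p)
        return set(self.suspected)


def choose_leader(detector, now):
    """Agreement-layer logic: it asks only whom to trust, never about timeouts."""
    trusted = sorted(set(detector.timeout) - detector.check(now))
    return trusted[0] if trusted else None


if __name__ == "__main__":
    fd = HeartbeatFailureDetector(["p1", "p2", "p3"])
    fd.on_heartbeat("p2", 1.0)
    fd.on_heartbeat("p3", 1.0)
    print(choose_leader(fd, 1.5))  # p1 is overdue, so the smallest trusted id is chosen
```

In this sketch, `choose_leader` never reads a clock or a timeout value itself; it only consumes the suspicion list. Under the stated assumptions, changing how suspicions are produced would leave the agreement-layer code untouched.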