Failure detection and group membership are two important components of fault-tolerant distributed systems. Understanding their role is essential when developing efficient solutions, not only in failure-free runs, but also in runs in which processes do crash. While group membership provides consistent information about the status of processes in the system, failure detectors provide inconsistent information. This paper discusses the trade-offs related to the use of these two components, and clarifies their roles using three examples. The first example shows a case where group membership may favourably be replaced by a failure detection mechanism. The second example illustrates a case where group membership is mandatory. Finally, the third example shows a case where neither group membership nor failure detectors are needed (they may be replaced by weak ordering oracles).