We study the problem of achieving reliable communication with quiescent algorithms (i.e., algorithms that eventually stop sending messages) in asynchronous systems with process crashes and lossy links. We first show that this problem cannot be solved in asynchronous systems without failure detectors. We then show that, among failure detectors that output lists of suspects, the weakest one that can be used to solve this problem is $\diamond\mathcal{P}$, a failure detector that cannot be implemented. To overcome this difficulty, we introduce an implementable failure detector called Heartbeat and show that it can be used to achieve quiescent reliable communication. Heartbeat is novel: in contrast to typical failure detectors, it does not output lists of suspects, and it is implementable without timeouts. With Heartbeat, many existing algorithms that tolerate only process crashes can be transformed into quiescent algorithms that tolerate both process crashes and message losses. This transformation applies to algorithms for consensus, atomic broadcast, k-set agreement, atomic commitment, and similar problems.
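The idea of retransmitting without timeouts can be illustrated with a small simulation. The abstract does not describe Heartbeat's output, so the sketch below assumes a common formulation: each process keeps one counter per peer, a counter that keeps growing while the peer is alive and stops growing after the peer crashes. The sender resends a message only when the receiver's counter has increased, and stops once it has an acknowledgment. The names `Process`, `quiescent_send`, `drop`, and `crash_at` are hypothetical, and the round-based loop with always-delivered heartbeats and acknowledgments is a simplification for illustration, not the paper's algorithm.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Optional


@dataclass
class Process:
    """A process holding one heartbeat counter per peer (assumed output format)."""
    name: str
    crashed: bool = False
    hb: Dict[str, int] = field(default_factory=dict)

    def on_heartbeat(self, sender: str) -> None:
        # Count a heartbeat received from `sender`; no clocks or timeouts involved.
        self.hb[sender] = self.hb.get(sender, 0) + 1


def quiescent_send(
    sender: Process,
    receiver: Process,
    rounds: int,
    drop: Callable[[int], bool],
    crash_at: Optional[int] = None,
) -> int:
    """Simulate `rounds` steps and return how many copies of one message were sent.

    drop(i) decides whether the i-th transmitted copy is lost by the link.
    Simplifications (assumptions): heartbeats and acknowledgments always arrive.
    """
    copies = 0
    acked = False
    last_seen = -1  # counter value at the time of the most recent transmission
    for step in range(rounds):
        if crash_at is not None and step >= crash_at:
            receiver.crashed = True
        if not receiver.crashed:
            sender.on_heartbeat(receiver.name)  # a live receiver keeps beating
        current = sender.hb.get(receiver.name, 0)
        # Retransmit only if unacknowledged AND the receiver's counter has grown.
        if not acked and current > last_seen:
            last_seen = current
            copies += 1
            if not receiver.crashed and not drop(copies):
                acked = True  # copy delivered and acknowledged
    return copies
```

In this sketch the sender goes quiet in both cases that matter: after an acknowledgment if the receiver is alive, and after the counter stops growing if the receiver has crashed. Running the simulation for more rounds therefore does not increase the number of copies sent.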