Unreliable failure detectors for reliable distributed systems
Journal of the ACM (JACM)
Efficient Algorithms to Implement Unreliable Failure Detectors in Partially Synchronous Systems
Proceedings of the 13th International Symposium on Distributed Computing
DISC '01 Proceedings of the 15th International Conference on Distributed Computing
Eventually consistent failure detectors
Journal of Parallel and Distributed Computing
Implementing the Omega failure detector in the crash-recovery failure model
Journal of Computer and System Sciences
Crash-quiescent failure detection
DISC'09 Proceedings of the 23rd international conference on Distributed computing
Eventually perfect failure detectors using ADD channels
ISPA'07 Proceedings of the 5th international conference on Parallel and Distributed Processing and Applications
On the implementation of communication-optimal failure detectors
LADC'07 Proceedings of the Third Latin-American conference on Dependable Computing
Several algorithms implementing failure detector classes $\diamondsuit\mathcal{Q}$ and $\diamondsuit\mathcal{P}$ have been proposed in the literature. The algorithm proposed by Chandra and Toueg in [2] uses a heartbeat mechanism and all-to-all communication to detect faulty processes. The algorithms proposed by Aguilera et al. in [1] and by Larrea et al. in [4] also use heartbeats, but rely on a leader-based approach. In contrast, the algorithm proposed by Larrea et al. in [3] uses a polling (query/reply) mechanism on a ring arrangement of processes. The leader-based and ring-based algorithms are more efficient than the all-to-all algorithm in the number of messages exchanged (linear vs. quadratic in the number of processes). Compared to polling, the heartbeat mechanism halves the number of messages, since each query/reply pair is replaced by a single heartbeat. Therefore, an algorithm that combines heartbeats with a ring arrangement should outperform the previous ones.
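The message-count argument can be illustrated with a rough sketch. The function below is not taken from any of the cited algorithms; it is a hypothetical, failure-free, steady-state model that assumes one monitoring link per ordered pair of processes for all-to-all, one link from the leader to each other process for the leader-based approach, and one link per process to its ring successor. It further assumes that a heartbeat costs one message per link and a poll costs two (query plus reply). The names `messages_per_round`, `LINKS`, and `MESSAGES_PER_LINK` are illustrative only.

```python
# Simplified per-round message counts; all formulas are modelling assumptions.
LINKS = {
    "all-to-all": lambda n: n * (n - 1),  # every process monitors every other
    "leader": lambda n: n - 1,            # leader talks to each other process
    "ring": lambda n: n,                  # each process monitors its successor
}

MESSAGES_PER_LINK = {
    "heartbeat": 1,  # one-way "I am alive" message
    "polling": 2,    # query followed by reply
}


def messages_per_round(n: int, topology: str, mechanism: str) -> int:
    """Return the assumed number of messages in one failure-free round."""
    if n < 2:
        raise ValueError("need at least two processes")
    return LINKS[topology](n) * MESSAGES_PER_LINK[mechanism]


if __name__ == "__main__":
    n = 10
    for topology in LINKS:
        for mechanism in MESSAGES_PER_LINK:
            count = messages_per_round(n, topology, mechanism)
            print(f"{topology:>10} + {mechanism:<9}: {count}")
```

Under these assumptions, the ring with heartbeats needs about n messages per round, half of what the ring with polling needs, and far fewer than the n(n-1) of the all-to-all scheme. Actual counts in the cited algorithms depend on details such as suspicion propagation and behavior after crashes, which this model ignores.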