An essential feature in any fault-tolerant design of distributed systems is a mechanism by which a process can reliably broadcast information to other processes in the presence of failures. This paper studies the message complexity of fault-tolerant broadcast protocols in weakly synchronous and totally asynchronous distributed systems with point-to-point communication links, where the system failures are caused by the processes but the communication links are completely reliable. We focus on the number of messages required of any fault-tolerant protocol in failure-free executions. Our motivation is that one should incur the cost of handling failures only when they actually occur. We present protocols that, in an $n$-process system subject to at most $t$ crash failures where $1 \leq t < (n - 1)$, guarantee the delivery of a message from any process to other nonfaulty processes. In the absence of crash failures, our protocols require $(n + t - 1)$ messages in the weakly synchronous model and $(t + 1)(n - 1 - (t/2))$ messages in the totally asynchronous model. Moreover, we show that in both cases our protocols are optimal with respect to message complexity. The new insights provided in our lower bound proofs also yield graph-theoretic characterizations of all message-optimal reliable broadcast protocols in failure-free executions. Both the upper and lower bound results on broadcast protocols can be generalized to multicast protocols, where a process only needs to deliver a message to a subset of processes in the system.

Index Terms: Reliable broadcasts/multicasts, distributed computing, network protocols, fault tolerance, message complexity.
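To make the two failure-free message counts in the abstract concrete, the following minimal sketch evaluates them for given $n$ and $t$. It only computes the closed-form expressions $(n + t - 1)$ and $(t + 1)(n - 1 - (t/2))$ under the stated constraint $1 \leq t < (n - 1)$; it does not implement the protocols themselves. The function name `failure_free_message_counts` and the dictionary keys are hypothetical, chosen here for illustration.

```python
def failure_free_message_counts(n: int, t: int) -> dict:
    """Evaluate the failure-free message counts quoted in the abstract.

    n: number of processes; t: maximum number of crash failures.
    The name and return shape are illustrative assumptions, not the paper's.
    """
    # The abstract states the results for 1 <= t < n - 1.
    if not (1 <= t < n - 1):
        raise ValueError("requires 1 <= t < n - 1")

    # Weakly synchronous model: n + t - 1 messages.
    weakly_synchronous = n + t - 1

    # Totally asynchronous model: (t + 1)(n - 1 - t/2) messages.
    # Rewritten as (t + 1)(2n - 2 - t) / 2 to stay in integer arithmetic;
    # the product is always even, since either t + 1 or 2n - 2 - t is even.
    totally_asynchronous = (t + 1) * (2 * n - 2 - t) // 2

    return {
        "weakly_synchronous": weakly_synchronous,
        "totally_asynchronous": totally_asynchronous,
    }


if __name__ == "__main__":
    for n, t in [(5, 2), (10, 3)]:
        print(n, t, failure_free_message_counts(n, t))
```

For example, with $n = 5$ and $t = 2$ the expressions evaluate to 6 messages in the weakly synchronous model and 9 in the totally asynchronous model.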