Continuous Consensus with Failures and Recoveries

Authors:
Tal Mizrahi;Yoram Moses
Affiliations:
Department of Electrical Engineering, Technion, Haifa, Israel 32000;Department of Electrical Engineering, Technion, Haifa, Israel 32000
Venue:
DISC '08 Proceedings of the 22nd international symposium on Distributed Computing
Year:
2008

Citing 11

The distributed firing squad problem. SIAM Journal on Computing
Knowledge and common knowledge in a distributed environment. Journal of the ACM (JACM)
Knowledge and common knowledge in a byzantine environment: crash failures. Information and Computation
Using knowledge to optimally achieve coordination in distributed systems. Theoretical Computer Science
Reaching Agreement in the Presence of Faults. Journal of the ACM (JACM)
Time is Not a Healer. STACS '89 Proceedings of the 6th Annual Symposium on Theoretical Aspects of Computer Science
Stability of long-lived consensus. Journal of Computer and System Sciences
Reasoning About Knowledge. Reasoning About Knowledge
Common knowledge and consistent simultaneous coordination. Distributed Computing
Proactive recovery in a Byzantine-fault-tolerant system. OSDI'00 Proceedings of the 4th conference on Symposium on Operating System Design & Implementation - Volume 4
'Eventual' is earlier than 'immediate'. SFCS '82 Proceedings of the 23rd Annual Symposium on Foundations of Computer Science

Cited 3

An Optimal Self-stabilizing Firing Squad. SSS '09 Proceedings of the 11th International Symposium on Stabilization, Safety, and Security of Distributed Systems
An Optimal Self-Stabilizing Firing Squad. SIAM Journal on Computing
Early-deciding consensus is expensive. Proceedings of the 2013 ACM symposium on Principles of distributed computing

Abstract

A continuous consensus (CC) protocol maintains for each process i at each time k an up-to-date core M_i[k] of information about the past, so that the cores at all processes are guaranteed to be identical. This is a generalization of simultaneous consensus that provides processes with the ability to perform simultaneously coordinated actions, and saves the need to compute multiple instances of simultaneous consensus at any given time. For an indefinite ongoing service of this type, it is somewhat unreasonable to assume a bound on the number of processes that ever fail. Moreover, over time, we can expect failed processes to be corrected. A failure assumption called (m,t) interval-bounded failures, closely related to the window of vulnerability model of Castro and Liskov, is considered for this type of service. The assumption is that in any given interval of m rounds, at most t processes can display faulty behavior. This paper presents an efficient CC protocol for the (m,t) bound in the crash and sending omissions failure models. A matching lower bound proof shows that the protocol is optimal in all runs (and not just in the worst case): For each and every behavior of the adversary, and at each time instant m, the core that our protocol maintains at time m is a superset of the core maintained by any other correct CC protocol under the same adversary. The lower bound is a significant generalization of previous proofs for common knowledge, and it applies to continuous consensus in a wide class of benign failure models, including the general omissions model, for which no similar proof existed.
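
The (m,t) interval-bounded failure assumption stated in the abstract can be illustrated with a small checker. This is only a sketch under assumptions that do not come from the paper: a run is represented as a list with one set per round, holding the identifiers of the processes that behave faultily in that round, and "at most t processes" is read as at most t distinct processes per window of m consecutive rounds. The function name `satisfies_interval_bound` and the data layout are hypothetical.

```python
from typing import Hashable, Sequence, Set


def satisfies_interval_bound(
    faulty_by_round: Sequence[Set[Hashable]], m: int, t: int
) -> bool:
    """Check an (m, t) interval bound on a recorded failure pattern.

    Assumed representation (not from the paper): faulty_by_round[r] is the
    set of processes that display faulty behavior in round r.
    Returns True if every window of m consecutive rounds contains at most
    t distinct faulty processes.
    """
    if m < 1 or t < 0:
        raise ValueError("m must be >= 1 and t must be >= 0")
    rounds = len(faulty_by_round)
    # If the run is shorter than m rounds, it is checked as a single window.
    last_start = max(rounds - m, 0)
    for start in range(last_start + 1):
        window = faulty_by_round[start:start + m]
        # Union of the faulty sets over the rounds in this window.
        distinct = set().union(*window)
        if len(distinct) > t:
            return False
    return True


# Process 1 fails, recovers, and fails again; three processes fail overall,
# yet no window of three rounds sees more than two distinct faulty processes.
pattern = [{1}, {2}, set(), {3}, {1}]
ok_m3_t2 = satisfies_interval_bound(pattern, m=3, t=2)
ok_m4_t2 = satisfies_interval_bound(pattern, m=4, t=2)
```

In this example the total number of processes that ever fail (three) exceeds t = 2, which is the situation the abstract describes: no global bound on failures is assumed, only a bound within each interval of m rounds.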