The possibility and the complexity of achieving fault-tolerant coordination

Authors:
Rida Bazzi; Gil Neiger
Affiliations:
College of Computing, Georgia Institute of Technology, Atlanta, Georgia; College of Computing, Georgia Institute of Technology, Atlanta, Georgia
Venue:
PODC '92 Proceedings of the eleventh annual ACM symposium on Principles of distributed computing
Year:
1992

Abstract

The problem of fault-tolerant coordination is fundamental in distributed computing. In the past, researchers have considered two types of coordination: general coordination, in which the actions of faulty processors are irrelevant, and consistent coordination, in which the faulty processors are forbidden from acting inconsistently. This paper studies the possibility and complexity of achieving coordination in synchronous and asynchronous systems with crash, send-omission, and general omission failures. We indicate the systems in which coordination cannot be achieved and, when it can, analyze the computational complexity of optimally achieving it. In some cases, optimum solutions can be implemented in polynomial time, while in others they require NP-hard local computation. These results provide a thorough characterization of coordination and will thus aid researchers in determining the approach to take when attempting to achieve fault-tolerant coordination.