There is a very close relationship between common knowledge and simultaneity in synchronous distributed systems. The analysis of several well-known problems in terms of common knowledge has led to round-optimal protocols for these problems, including Reliable Broadcast, Distributed Consensus, and the Distributed Firing Squad problem. These problems require that the correct processors coordinate their actions in some way but place no restrictions on the behaviour of the faulty processors. In systems with benign processor failures, however, it is reasonable to require that the actions of a faulty processor be consistent with those of the correct processors, assuming it performs any action at all. We consider problems requiring consistent, simultaneous coordination. We then analyze these problems in terms of common knowledge in several failure models. The analysis of these stronger problems requires a stronger definition of common knowledge, and we study the relationship between these two definitions. In many cases, the two definitions are actually equivalent, and simple modifications of previous solutions yield round-optimal solutions to these problems. When the definitions differ, however, we show that such problems cannot be solved, even in failure-free executions.
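To make the central notion concrete, the following is a minimal sketch of how ordinary common knowledge can be computed on a small finite model, using the standard fixed-point characterization: a fact is common knowledge exactly where it survives repeated application of "everyone knows". This is not the stronger definition studied here, nor any of the protocols mentioned above. The toy worlds, the two processor views, and all function names (`knows`, `everyone_knows`, `common_knowledge`, `WORLDS`, `VIEWS`) are assumptions made up for illustration.

```python
# Illustrative toy model only; names and worlds are hypothetical.

def knows(view, worlds, fact):
    """Worlds where one processor knows `fact`.

    A processor knows `fact` at world w if `fact` holds at every world
    it cannot distinguish from w (every world with the same local view).
    """
    return {
        w for w in worlds
        if all(v in fact for v in worlds if view(v) == view(w))
    }


def everyone_knows(views, worlds, fact):
    """Worlds where every processor knows `fact`."""
    result = set(worlds)
    for view in views:
        result &= knows(view, worlds, fact)
    return result


def common_knowledge(views, worlds, fact):
    """Greatest fixed point of X -> everyone_knows(X), starting from `fact`."""
    current = set(fact) & set(worlds)
    while True:
        nxt = everyone_knows(views, worlds, current) & current
        if nxt == current:
            return current
        current = nxt


# Four hypothetical global states (worlds) and two processors.
WORLDS = {0, 1, 2, 3}
VIEWS = [
    lambda w: w // 2,        # processor 1 cannot tell 0 from 1, or 2 from 3
    lambda w: (w + 1) // 2,  # processor 2 cannot tell 1 from 2
]

fact = {1, 2, 3}  # a fact that fails only in world 0

print(everyone_knows(VIEWS, WORLDS, fact))      # {2, 3}
print(common_knowledge(VIEWS, WORLDS, fact))    # set()
print(common_knowledge(VIEWS, WORLDS, WORLDS))  # {0, 1, 2, 3}
```

In this toy model everyone knows the fact in worlds 2 and 3, yet it is common knowledge nowhere, because a chain of indistinguishable worlds leads back to world 0, where the fact fails. That gap between "everyone knows" and common knowledge is the standard reason simultaneous action is tied to common knowledge rather than to any finite depth of knowledge.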