The possibility and the complexity of achieving fault-tolerant coordination

  • Authors:
  • Rida Bazzi; Gil Neiger

  • Affiliations:
  • College of Computing, Georgia Institute of Technology, Atlanta, Georgia; College of Computing, Georgia Institute of Technology, Atlanta, Georgia

  • Venue:
  • PODC '92: Proceedings of the Eleventh Annual ACM Symposium on Principles of Distributed Computing
  • Year:
  • 1992

Abstract

The problem of fault-tolerant coordination is fundamental in distributed computing. In the past, researchers have considered two types of coordination: general coordination, in which the actions of faulty processors are irrelevant, and consistent coordination, in which the faulty processors are forbidden from acting inconsistently. This paper studies the possibility and complexity of achieving coordination in synchronous and asynchronous systems with crash, send-omission, and general omission failures. We indicate the systems in which coordination cannot be achieved and, when it can, analyze the computational complexity of optimally achieving it. In some cases, optimum solutions can be implemented in polynomial time, while in others they require NP-hard local computation. These results provide a thorough characterization of coordination and will thus aid researchers in determining the approach to take when attempting to achieve fault-tolerant coordination.