Extracting quorum failure detectors

Authors:
Vibhor Bhatt;Nicholas Christman;Prasad Jayanti
Affiliations:
Dartmouth College, Hanover, NH, USA;McKinsey & Company, Boston, MA, USA;Dartmouth College, Hanover, NH, USA
Venue:
Proceedings of the 28th ACM symposium on Principles of distributed computing
Year:
2009

References (10)

Agreement is harder than consensus: set consensus problems in totally asynchronous systems
PODC '90 Proceedings of the ninth annual ACM symposium on Principles of distributed computing

Impossibility of distributed consensus with one faulty process
Journal of the ACM (JACM)

Unreliable failure detectors for reliable distributed systems
Journal of the ACM (JACM)

The weakest failure detector for solving consensus
Journal of the ACM (JACM)

Time, clocks, and the ordering of events in a distributed system
Communications of the ACM

Solution of a problem in concurrent programming control
Communications of the ACM

The weakest failure detectors to solve certain fundamental problems in distributed computing
Proceedings of the twenty-third annual ACM symposium on Principles of distributed computing

Mutual exclusion in asynchronous systems with failure detectors
Journal of Parallel and Distributed Computing

The weakest failure detector to solve nonuniform consensus
Proceedings of the twenty-fourth annual ACM symposium on Principles of distributed computing

Every problem has a weakest failure detector
Proceedings of the twenty-seventh ACM symposium on Principles of distributed computing

Cited by (2)

On the existence of weakest failure detectors for mutual exclusion and k-exclusion
DISC'09 Proceedings of the 23rd international conference on Distributed computing

Asynchronous failure detectors
PODC '12 Proceedings of the 2012 ACM symposium on Principles of distributed computing

Abstract

It is well known that the failure detector Ω is necessary and sufficient to solve consensus in asynchronous message-passing systems where a majority of processes is guaranteed to be correct [1,2]. But what if the problem must be solved in an arbitrary environment, where any number of processes may fail at any time? The answer was provided by Delporte et al., who showed that a certain quorum failure detector, which they called Σ, is necessary and, together with Ω, sufficient to solve consensus in any environment [4]. Moving beyond consensus and other specific problems, we pose and answer the following general question: given an arbitrary decision problem P, is it possible to identify a quorum failure detector that is provably necessary to solve P in any environment? We answer this question in the affirmative: we present a universal quorum failure detector Q that, when instantiated with any decision problem P, yields a quorum failure detector that is necessary to solve P in any environment. We demonstrate the power of this universal detector by showing that several existing quorum failure detectors, known to be necessary and sufficient to solve some well-known problems, are obtained by instantiating Q with those problems. In particular, the quorum failure detector Σ proposed for consensus [4], Σv proposed for nonuniform consensus [8], and the loneliness detector for (n−1)-set agreement [6] are all obtained by instantiating the universal detector Q with the respective problems. Besides yielding existing failure detectors, Q has led to a new and useful detector. While mutual exclusion can be solved using the trusting failure detector T when a majority of processes is correct [5], it was not previously known how to solve this problem in an arbitrary environment. We show that the quorum failure detector Σl, obtained by instantiating Q with the mutual exclusion problem, is sufficient, together with T, to solve mutual exclusion in any environment. The necessity of Σl to solve mutual exclusion is implied by our general theorem. Our proof of the universality of Q employs a technique different from the common reduction technique introduced by Chandra et al. [1], in which processes build an infinite DAG of sampled failure detector values and use it to locally simulate runs of the algorithm.
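
To make the notion of a quorum failure detector concrete, here is a minimal Python sketch of the two properties usually required of Σ in the literature. That definition is not restated in this abstract, so it is an assumption here: any two quorums output at any processes and times intersect, and eventually every quorum output at a correct process contains only correct processes. The function names, the trace format, and the example data are hypothetical, and a finite trace can only approximate the "eventually" in the second property.

```python
from itertools import combinations


def satisfies_intersection(quorums):
    """Return True if every pair of sampled quorums shares a process."""
    return all(a & b for a, b in combinations(quorums, 2))


def satisfies_completeness_suffix(trace, correct):
    """Finite stand-in for 'eventually': check that the last quorum sampled
    at each correct process contains only correct processes."""
    last = {}
    for process, quorum in trace:
        last[process] = quorum  # later samples overwrite earlier ones
    return all(last[p] <= correct for p in correct if p in last)


# Hypothetical trace: (process, quorum output) pairs in sampling order.
trace = [
    ("p1", frozenset({"p1", "p2", "p3"})),
    ("p2", frozenset({"p2", "p3"})),
    ("p3", frozenset({"p1", "p3"})),  # assume p1 crashes after this sample
    ("p2", frozenset({"p2", "p3"})),
    ("p3", frozenset({"p3"})),
]
correct = frozenset({"p2", "p3"})

quorums = [quorum for _, quorum in trace]
intersection_ok = satisfies_intersection(quorums)
completeness_ok = satisfies_completeness_suffix(trace, correct)
```

The sketch only checks a recorded history against the two properties. It does not implement a failure detector, and it says nothing about how the universal detector Q is constructed.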