Performing Work Efficiently in the Presence of Faults

Authors:
Cynthia Dwork;Joseph Y. Halpern;Orli Waarts
Affiliations:
-;-;-
Venue:
SIAM Journal on Computing
Year:
1998

Citing 0
Cited 32

The do-all problem in broadcast networks

Proceedings of the twentieth annual ACM symposium on Principles of distributed computing
Optimal scheduling for disconnected cooperation

Proceedings of the twentieth annual ACM symposium on Principles of distributed computing
Gossiping to reach consensus

Proceedings of the fourteenth annual ACM symposium on Parallel algorithms and architectures
Randomization Helps to Perform Tasks on Processors Prone to Failures

Proceedings of the 13th International Symposium on Distributed Computing
Distributed Cooperation During the Absence of Communication

DISC '00 Proceedings of the 14th International Conference on Distributed Computing
The Complexity of Synchronous Iterative Do-All with Crashes

DISC '01 Proceedings of the 15th International Conference on Distributed Computing
Stable Leader Election

DISC '01 Proceedings of the 15th International Conference on Distributed Computing
Bounding Work and Communication in Robust Cooperative Computation

DISC '02 Proceedings of the 16th International Conference on Distributed Computing
Optimal F-Reliable Protocols for the Do-All Problem on Single-Hop Wireless Networks

ISAAC '02 Proceedings of the 13th International Symposium on Algorithms and Computation
Performing work with asynchronous processors: message-delay-sensitive bounds

Proceedings of the twenty-second annual symposium on Principles of distributed computing
Deterministic computations on a PRAM with static processor and memory faults

Fundamenta Informaticae
Cooperative computing with fragmentable and mergeable groups

Journal of Discrete Algorithms
Randomization helps to perform independent tasks reliably

Random Structures & Algorithms
Writing-all deterministically and optimally using a non-trivial number of asynchronous processors

Proceedings of the sixteenth annual ACM symposium on Parallelism in algorithms and architectures
Task allocation in a multi-server system

Journal of Scheduling
Performing work with asynchronous processors: message-delay-sensitive bounds

Information and Computation
Efficient gossip and robust distributed computation

Theoretical Computer Science
Robust gossiping with an application to consensus

Journal of Computer and System Sciences
Writing-all deterministically and optimally using a nontrivial number of asynchronous processors

ACM Transactions on Algorithms (TALG)
A robust randomized algorithm to perform independent tasks

Journal of Discrete Algorithms
Fast scalable deterministic consensus for crash failures

Proceedings of the 28th ACM symposium on Principles of distributed computing
Performing work with asynchronous processors: Message-delay-sensitive bounds

Information and Computation
Locating and repairing faults in a network with mobile agents

Theoretical Computer Science
Emulating shared-memory Do-All algorithms in asynchronous message-passing systems

Journal of Parallel and Distributed Computing
Distributed agreement with optimal communication complexity

SODA '10 Proceedings of the twenty-first annual ACM-SIAM symposium on Discrete Algorithms
Performing dynamically injected tasks on processes prone to crashes and restarts

DISC'11 Proceedings of the 25th international conference on Distributed computing
Asynchronous perfectly secure communication over one-time pads

ICALP'05 Proceedings of the 32nd international conference on Automata, Languages and Programming
Time and communication efficient consensus for crash failures

DISC'06 Proceedings of the 20th international conference on Distributed Computing
Reliably executing tasks in the presence of malicious processors

DISC'05 Proceedings of the 19th international conference on Distributed Computing
Robust network supercomputing without centralized control

OPODIS'11 Proceedings of the 15th international conference on Principles of Distributed Systems
Deterministic Computations on a PRAM with Static Processor and Memory Faults

Fundamenta Informaticae
On the message complexity of indulgent consensus

DISC'07 Proceedings of the 21st international conference on Distributed Computing

Abstract

We consider a system of t synchronous processes that communicate only by sending messages to one another and that together must perform n independent units of work. Processes may fail by crashing; we want to guarantee that in every execution of the protocol in which at least one process survives, all n units of work will be performed. We consider three parameters: the number of messages sent, the total number of units of work performed (including multiplicities), and time. We present three protocols for solving the problem. All three are work optimal, doing O(n+t) work. The first has moderate costs in the remaining two parameters, sending $O(t\sqrt{t})$ messages and taking O(n+t) time. This protocol can be easily modified to run in any completely asynchronous system equipped with a failure detection mechanism. The second sends only O(t log t) messages, but its running time is large ($O(t^2(n+t)2^{n+t})$). The third is essentially time optimal in the (usual) case in which there are no failures, and its time complexity degrades gracefully as the number of failures increases.
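
To make the cost measures in the abstract concrete, here is a minimal Python sketch of a simple baseline strategy. It is not one of the paper's three protocols; it is an illustrative assumption only. One process at a time is active, it performs units in order, and after every `checkpoint_every` units it reports its progress to all higher-numbered processes. When the active process crashes, the next process resumes from the last checkpoint. The function name `simulate`, the parameter `checkpoint_every`, and the crash schedule `crash_after` are all hypothetical names introduced for this sketch.

```python
def simulate(n, t, checkpoint_every, crash_after):
    """Simulate a single-active-worker baseline (NOT the paper's protocols).

    n, t             -- units of work and number of processes, as in the abstract
    checkpoint_every -- hypothetical knob: units performed between checkpoints
    crash_after      -- hypothetical crash schedule: pid -> number of units that
                        process performs before it crashes (absent = never crashes)
    """
    done = [False] * n
    known = 0      # progress recorded by the most recent checkpoint
    work = 0       # units performed, counted with multiplicities
    messages = 0   # checkpoint messages sent

    for pid in range(t):
        budget = crash_after.get(pid)  # None means this process never crashes
        performed = 0
        pos = known                    # resume from the last checkpoint
        crashed = False
        while pos < n:
            if budget is not None and performed >= budget:
                crashed = True         # crash before performing the next unit
                break
            done[pos] = True
            work += 1
            performed += 1
            pos += 1
            if pos % checkpoint_every == 0 or pos == n:
                # Report progress to every higher-numbered process.
                messages += t - 1 - pid
                known = pos
        if not crashed:
            break                      # a surviving process finished the work

    return {"work": work, "messages": messages, "all_done": all(done)}


# Process 0 crashes after 7 units; units 5 and 6 are redone by process 1.
print(simulate(n=10, t=3, checkpoint_every=5, crash_after={0: 7}))
```

In this run the work measure is 12 rather than 10, because the two units performed after the last checkpoint are repeated. In this sketch, a smaller `checkpoint_every` wastes less work per crash but sends more messages, which is the kind of trade-off among work, messages, and time that the abstract describes.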