We consider a system of t synchronous processes that communicate only by sending messages to one another and must together perform n independent units of work. Processes may fail by crashing; we want to guarantee that in every execution of the protocol in which at least one process survives, all n units of work are performed. We consider three complexity measures: the number of messages sent, the total number of units of work performed (counting multiplicities), and time. We present three protocols for solving the problem, all of which are work optimal, performing O(n+t) work. The first has moderate costs in the remaining two measures: it sends $O(t\sqrt{t})$ messages and takes O(n+t) time. This protocol is easily modified to run in any completely asynchronous system equipped with a failure-detection mechanism. The second sends only O(t log t) messages, but its running time is large, $O(t^2(n+t)2^{n+t})$. The third is essentially time optimal in the (usual) case in which there are no failures, and its time complexity degrades gracefully as the number of failures increases.
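The problem statement can be illustrated with a deliberately naive, communication-free baseline; this is a sketch for intuition only, not one of the paper's three protocols. The function name, the crash model (each process is given a round at which it crashes, or None if it survives), and the rotated schedule are all illustrative assumptions. If at least one process survives, every unit is performed, but the total work is O(nt) in the worst case, far from the O(n+t) bound the paper's message-passing protocols achieve.

```python
def do_all_baseline(n, t, crashed):
    """Naive Do-All baseline: each of t synchronous processes walks
    through all n task units in a rotated order, with no messages.
    crashed[p] is the round at which process p crashes (None = survives).
    Returns (all_units_done, total_work_performed).  Work is O(n*t)
    in the worst case; the paper's protocols do O(n+t) work by
    coordinating via messages."""
    done = [False] * n
    work = 0
    for rnd in range(n):                      # one unit per round
        for p in range(t):
            if crashed[p] is not None and rnd >= crashed[p]:
                continue                      # p has already crashed
            task = (p * (n // t) + rnd) % n   # rotated schedule
            done[task] = True
            work += 1
    return all(done), work
```

Because every surviving process eventually visits all n units, correctness needs only one survivor; the cost of this simplicity is redundant work, which is exactly the tension the three protocols in the abstract trade off against message and time complexity.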