The Do-All problem with Byzantine processor failures

Authors:
Antonio Fernández;Chryssis Georgiou;Alexander Russell;Alex A. Shvartsman
Affiliations:
GSyC, Universidad Rey Juan Carlos, Móstoles, Spain and University of Connecticut;Department of Computer Science, University of Cyprus, Nicosia, Cyprus;Department of Computer Science and Engineering, University of Connecticut, Storrs, CT;Department of Computer Science and Engineering, University of Connecticut, Storrs, CT and Computer Science and Artificial Intelligence Laboratory, Massachusetts Institute of Technology, Cambridge, ...
Venue:
Theoretical Computer Science - Foundations of software science and computation structures
Year:
2005


Abstract

Do-All is the abstract problem of using n processors to cooperatively perform m independent tasks in the presence of failures. This problem and its derivatives have been a centerpiece in the study of trade-offs between efficiency and fault-tolerance in cooperative computing environments. Many algorithms have been developed for Do-All in various models of computation, including message-passing, partitionable networks, and shared-memory models under a variety of failure models.

This work initiates the study of the Do-All problem for synchronous message-passing processors prone to Byzantine failures. In particular, upper and lower bounds are given on the complexity of Do-All for several cases: (a) the case where the maximum number of faulty processors f is known a priori, (b) the case where f is not known, (c) the case where a task execution can be verified (without re-executing the task), and (d) the case where task executions cannot be verified.

The efficiency of algorithms is evaluated in terms of work and message complexities. The work complexity accounts for all computational steps taken by the processors and the message complexity accounts for all messages sent by the processors during the computation. The work and messages of a faulty processor are counted only until the processor fails to follow the algorithm. It is shown that in some cases obtaining work Θ(mn) is the best one can do. It is also shown that in certain cases communication cannot help improve work efficiency.
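
To make the two measures concrete, the following minimal Python sketch counts work and messages for the naive strategy in which every processor performs all m tasks on its own and never communicates. It is an illustration only and not an algorithm taken from the paper; the function name `trivial_do_all` and the `deviates_at` parameter are assumptions introduced here. Each task is assumed to cost one computational step.

```python
def trivial_do_all(n, m, deviates_at=None):
    """Count work and messages when each of n processors runs all m tasks.

    deviates_at is a hypothetical parameter: it maps a faulty processor id
    to the step at which that processor stops following the algorithm.
    Steps after that point are not charged, mirroring the accounting rule
    stated in the abstract.
    """
    deviates_at = deviates_at or {}
    work = 0
    messages = 0  # this naive strategy never sends a message
    for p in range(n):
        # A correct processor is charged m steps; a faulty one is charged
        # only for the steps it took before deviating.
        charged_steps = min(m, deviates_at.get(p, m))
        work += charged_steps
    return work, messages


if __name__ == "__main__":
    print(trivial_do_all(n=4, m=10))                      # no failures
    print(trivial_do_all(n=4, m=10, deviates_at={0: 3}))  # processor 0 deviates after 3 steps
```

With no failures the charged work is exactly n·m and the message count is zero, so this strategy has work of order mn, the same order as the Θ(mn) bound that the abstract says is the best achievable in some cases.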