Efficient robust parallel computations
STOC '90 Proceedings of the twenty-second annual ACM symposium on Theory of computing
Combining tentative and definite executions for very fast dependable parallel computing
STOC '91 Proceedings of the twenty-third annual ACM symposium on Theory of computing
Distributed computing: models and methods
Handbook of theoretical computer science (vol. B)
Achieving optimal CRCW PRAM fault-tolerance
Information Processing Letters
STOC '92 Proceedings of the twenty-fourth annual ACM symposium on Theory of computing
Work-optimal asynchronous algorithms for shared memory parallel computers
SIAM Journal on Computing
On the complexity of certified write-all algorithms
Journal of Algorithms
Time-optimal message-efficient work performance in the presence of faults
PODC '94 Proceedings of the thirteenth annual ACM symposium on Principles of distributed computing
Impossibility of distributed consensus with one faulty process
Journal of the ACM (JACM)
Parallel algorithms with processor failures and delays
Journal of Algorithms
Algorithms for the Certified Write-All Problem
SIAM Journal on Computing
Performing Work Efficiently in the Presence of Faults
SIAM Journal on Computing
Fault-tolerant broadcasts and related problems
Distributed systems (2nd Ed.)
Reaching Agreement in the Presence of Faults
Journal of the ACM (JACM)
The Byzantine Generals Problem
ACM Transactions on Programming Languages and Systems (TOPLAS)
Fail-stop processors: an approach to designing fault-tolerant computing systems
ACM Transactions on Computer Systems (TOCS)
Fault-Tolerant Parallel Computation
Distributed Cooperation During the Absence of Communication
DISC '00 Proceedings of the 14th International Conference on Distributed Computing
Resolving message complexity of Byzantine Agreement and beyond
FOCS '95 Proceedings of the 36th Annual Symposium on Foundations of Computer Science
Waitfree distributed memory management by Create, and Read Until Deletion (CRUD)
Parallel processing on networks of workstations: a fault-tolerant, high performance approach
ICDCS '95 Proceedings of the 15th International Conference on Distributed Computing Systems
An algorithm for the asynchronous Write-All problem based on process collision
Distributed Computing
Performing tasks on synchronous restartable message-passing processors
Distributed Computing
Efficient parallel algorithms can be made robust
Distributed Computing
Asynchronous PRAMs are (almost) as good as synchronous PRAMs
SFCS '90 Proceedings of the 31st Annual Symposium on Foundations of Computer Science
Clock construction in fully asynchronous parallel systems and PRAM simulation
SFCS '92 Proceedings of the 33rd Annual Symposium on Foundations of Computer Science
Bounding Work and Communication in Robust Cooperative Computation
DISC '02 Proceedings of the 16th International Conference on Distributed Computing
Distributed cooperation and adversity: complexity trade-offs
PCK50 Proceedings of the Paris C. Kanellakis memorial workshop on Principles of computing & knowledge: Paris C. Kanellakis memorial workshop on the occasion of his 50th birthday
The complexity of synchronous iterative Do-All with crashes
Distributed Computing
Dynamic load balancing with group communication
Theoretical Computer Science
Fast randomized test-and-set and renaming
DISC'10 Proceedings of the 24th international conference on Distributed computing
Do-All is the problem of performing N tasks in a distributed system of P failure-prone processors [8]. Many distributed and parallel algorithms have been developed for this problem, and several algorithm simulations are built by iterating Do-All algorithms. The efficiency of Do-All solutions is measured in terms of work complexity, which counts all processing steps taken by the processors. We present the first non-trivial lower bounds for Do-All that capture the dependence of work on N, P, and f, the number of processor crashes. For the model of computation in which processors can make perfect load-balancing decisions locally, we also present matching upper bounds. We define the r-iterative Do-All problem, which abstracts the repeated use of Do-All found in algorithm simulations. Our f-sensitive analysis yields a tight bound on r-iterative Do-All work that is stronger than r times the work complexity of a single Do-All. Modeling perfect load-balancing also allows the analysis of a specific algorithm to be divided into two parts: (i) the cost of tolerating failures while performing work, and (ii) the cost of implementing load-balancing. We demonstrate the utility and generality of this approach by improving the analyses of two known efficient algorithms. Finally, we present a new upper bound on simulations of synchronous shared-memory algorithms on crash-prone processors.
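The work measure described above can be illustrated with a small simulation. The sketch below is a hypothetical illustration, not an algorithm from the paper: it assumes a synchronous model with an oracle that load-balances perfectly each round, charges one unit of work per live processor per round, and lets an adversary crash up to f processors between rounds. The function name `do_all_work` and its parameters are inventions for this example.

```python
import random

def do_all_work(n_tasks, n_procs, crashes, seed=0):
    """Simulate synchronous Do-All with perfect (oracle) load balancing.

    Each round, the surviving processors are spread evenly over the
    remaining tasks; work counts one step per live processor per round.
    An adversary may crash up to `crashes` processors in total, but
    always leaves at least one processor alive.
    """
    rng = random.Random(seed)
    remaining = n_tasks
    live = n_procs
    work = 0
    crash_budget = crashes
    while remaining > 0:
        work += live                          # every live processor takes a step
        remaining -= min(live, remaining)     # perfect balancing: no task done twice
        if crash_budget > 0:
            k = rng.randint(0, min(crash_budget, live - 1))
            live -= k                         # adversary crashes k processors
            crash_budget -= k
    return work

# Failure-free and P divides N: every step completes a distinct task,
# so work is exactly N. With crashes, later rounds may have more live
# processors than remaining tasks, so work can only grow beyond N.
print(do_all_work(100, 10, 0))
print(do_all_work(100, 10, 5) >= 100)
```

Running such a simulation over a range of f values gives an empirical feel for the f-sensitive behavior that the paper's lower and upper bounds characterize analytically.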