Efficient robust parallel computations
STOC '90 Proceedings of the twenty-second annual ACM symposium on Theory of computing
Combining tentative and definite executions for very fast dependable parallel computing
STOC '91 Proceedings of the twenty-third annual ACM symposium on Theory of computing
Distributed computing: models and methods
Handbook of theoretical computer science (vol. B)
Achieving optimal CRCW PRAM fault-tolerance
Information Processing Letters
STOC '92 Proceedings of the twenty-fourth annual ACM symposium on Theory of computing
Work-optimal asynchronous algorithms for shared memory parallel computers
SIAM Journal on Computing
On the complexity of certified write-all algorithms
Journal of Algorithms
Time-optimal message-efficient work performance in the presence of faults
PODC '94 Proceedings of the thirteenth annual ACM symposium on Principles of distributed computing
Impossibility of distributed consensus with one faulty process
Journal of the ACM (JACM)
Parallel algorithms with processor failures and delays
Journal of Algorithms
Algorithms for the Certified Write-All Problem
SIAM Journal on Computing
Performing Work Efficiently in the Presence of Faults
SIAM Journal on Computing
Fault-tolerant broadcasts and related problems
Distributed systems (2nd Ed.)
Reaching Agreement in the Presence of Faults
Journal of the ACM (JACM)
The Byzantine Generals Problem
ACM Transactions on Programming Languages and Systems (TOPLAS)
Fail-stop processors: an approach to designing fault-tolerant computing systems
ACM Transactions on Computer Systems (TOCS)
Fault-Tolerant Parallel Computation
Distributed Cooperation During the Absence of Communication
DISC '00 Proceedings of the 14th International Conference on Distributed Computing
Resolving message complexity of Byzantine Agreement and beyond
FOCS '95 Proceedings of the 36th Annual Symposium on Foundations of Computer Science
Waitfree distributed memory management by Create, and Read Until Deletion (CRUD)
Parallel processing on networks of workstations: a fault-tolerant, high performance approach
ICDCS '95 Proceedings of the 15th International Conference on Distributed Computing Systems
An algorithm for the asynchronous Write-All problem based on process collision
Distributed Computing
Performing tasks on synchronous restartable message-passing processors
Distributed Computing
Efficient parallel algorithms can be made robust
Distributed Computing
Asynchronous PRAMs are (almost) as good as synchronous PRAMs
SFCS '90 Proceedings of the 31st Annual Symposium on Foundations of Computer Science
Clock construction in fully asynchronous parallel systems and PRAM simulation
SFCS '92 Proceedings of the 33rd Annual Symposium on Foundations of Computer Science
Bounding Work and Communication in Robust Cooperative Computation
DISC '02 Proceedings of the 16th International Conference on Distributed Computing
Distributed cooperation and adversity: complexity trade-offs
PCK50 Proceedings of the Paris C. Kanellakis memorial workshop on Principles of computing & knowledge: Paris C. Kanellakis memorial workshop on the occasion of his 50th birthday
The complexity of synchronous iterative Do-All with crashes
Distributed Computing
Dynamic load balancing with group communication
Theoretical Computer Science
Fast randomized test-and-set and renaming
DISC'10 Proceedings of the 24th international conference on Distributed computing
Do-All is the problem of performing N tasks in a distributed system of P failure-prone processors [8]. Many distributed and parallel algorithms have been developed for this problem, and several algorithm simulations are built by iterating Do-All algorithms. The efficiency of Do-All solutions is measured in terms of work complexity, which counts all processing steps taken by the processors. We present the first non-trivial lower bounds for Do-All that capture the dependence of work on N, P, and f, the number of processor crashes. For the model of computation in which processors can make perfect load-balancing decisions locally, we also present matching upper bounds. We define the r-iterative Do-All problem, which abstracts the repeated use of Do-All found in algorithm simulations. Our f-sensitive analysis yields a tight bound on r-iterative Do-All work that is stronger than r times the work complexity of a single Do-All. Modeling perfect load-balancing also allows the analysis of a specific algorithm to be divided into two parts: (i) the cost of tolerating failures while performing work, and (ii) the cost of implementing load-balancing. We demonstrate the utility and generality of this approach by improving the analyses of two known efficient algorithms. Finally, we present a new upper bound on simulations of synchronous shared-memory algorithms on crash-prone processors.
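The work measure described above can be illustrated with a small simulation. The sketch below is a hypothetical illustration, not an algorithm from the paper: it assumes a synchronous model with an oracle that load-balances perfectly each round, charges one unit of work per live processor per round, and lets an adversary crash up to f processors between rounds. The function name `do_all_work` and its parameters are inventions for this example.

```python
import random

def do_all_work(n_tasks, n_procs, crashes, seed=0):
    """Simulate synchronous Do-All with perfect (oracle) load balancing.

    Each round, the surviving processors are spread evenly over the
    remaining tasks; work counts one step per live processor per round.
    An adversary may crash up to `crashes` processors in total, but
    always leaves at least one processor alive.
    """
    rng = random.Random(seed)
    remaining = n_tasks
    live = n_procs
    work = 0
    crash_budget = crashes
    while remaining > 0:
        work += live                          # every live processor takes a step
        remaining -= min(live, remaining)     # perfect balancing: no task done twice
        if crash_budget > 0:
            k = rng.randint(0, min(crash_budget, live - 1))
            live -= k                         # adversary crashes k processors
            crash_budget -= k
    return work

# Failure-free and P divides N: every step completes a distinct task,
# so work is exactly N. With crashes, later rounds may have more live
# processors than remaining tasks, so work can only grow beyond N.
print(do_all_work(100, 10, 0))
print(do_all_work(100, 10, 5) >= 100)
```

Running such a simulation over a range of f values gives an empirical feel for the f-sensitive behavior that the paper's lower and upper bounds characterize analytically.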