Parallel algorithmic techniques for combinatorial computation. Annual Review of Computer Science, vol. 3, 1988.
Towards an architecture-independent analysis of parallel algorithms. STOC '88: Proceedings of the twentieth annual ACM symposium on Theory of computing.
Deterministic P-RAM simulation with constant redundancy. SPAA '89: Proceedings of the first annual ACM symposium on Parallel algorithms and architectures.
The APRAM: incorporating asynchrony into the PRAM model. SPAA '89: Proceedings of the first annual ACM symposium on Parallel algorithms and architectures.
Efficient parallel algorithms can be made robust. Proceedings of the eighth annual ACM symposium on Principles of distributed computing.
A bridging model for parallel computation. Communications of the ACM.
Asynchronous shared memory parallel computation. SPAA '90: Proceedings of the second annual ACM symposium on Parallel algorithms and architectures.
The expected advantage of asynchrony. SPAA '90: Proceedings of the second annual ACM symposium on Parallel algorithms and architectures.
Efficient robust parallel computations. STOC '90: Proceedings of the twenty-second annual ACM symposium on Theory of computing.
Parallel algorithms for shared-memory machines. Handbook of theoretical computer science (vol. A).
General purpose parallel architectures. Handbook of theoretical computer science (vol. A).
Achieving optimal CRCW PRAM fault-tolerance. Information Processing Letters.
Work-optimal asynchronous algorithms for shared memory parallel computers. SIAM Journal on Computing.
LogP: towards a realistic model of parallel computation. PPOPP '93: Proceedings of the fourth ACM SIGPLAN symposium on Principles and practice of parallel programming.
Lectures on parallel computation.
Parallel algorithms with processor failures and delays. Journal of Algorithms.
Constructions of permutation arrays for certain scheduling cost measures. Random Structures & Algorithms.
Algorithms for the Certified Write-All Problem. SIAM Journal on Computing.
The art of computer programming, volume 2 (3rd ed.): seminumerical algorithms.
"Dynamic-fault-prone BSP": a paradigm for robust computations in changing environments. Proceedings of the tenth annual ACM symposium on Parallel algorithms and architectures.
The art of computer programming, volume 3 (2nd ed.): sorting and searching.
Fault-Tolerant Parallel Computation.
Parallelism in random access machines. STOC '78: Proceedings of the tenth annual ACM symposium on Theory of computing.
Waitfree distributed memory management by Create, and Read Until Deletion (CRUD).
Parallel processing on networks of workstations: a fault-tolerant, high performance approach. ICDCS '95: Proceedings of the 15th International Conference on Distributed Computing Systems.
An algorithm for the asynchronous Write-All problem based on process collision. Distributed Computing.
The do-all problem in broadcast networks. Proceedings of the twentieth annual ACM symposium on Principles of distributed computing.
A work-optimal deterministic algorithm for the asynchronous certified write-all problem. Proceedings of the twenty-second annual symposium on Principles of distributed computing.
Deterministic computations on a PRAM with static processor and memory faults. Fundamenta Informaticae.
Collective asynchronous reading with polylogarithmic worst-case overhead. STOC '04: Proceedings of the thirty-sixth annual ACM symposium on Theory of computing.
Writing-all deterministically and optimally using a non-trivial number of asynchronous processors. Proceedings of the sixteenth annual ACM symposium on Parallelism in algorithms and architectures.
Cooperative asynchronous update of shared memory. Proceedings of the thirty-seventh annual ACM symposium on Theory of computing.
A tight analysis and near-optimal instances of the algorithm of Anderson and Woll. Theoretical Computer Science.
The Do-All problem with Byzantine processor failures. Theoretical Computer Science - Foundations of software science and computation structures.
Writing-all deterministically and optimally using a nontrivial number of asynchronous processors. ACM Transactions on Algorithms (TALG).
Emulating shared-memory Do-All algorithms in asynchronous message-passing systems. Journal of Parallel and Distributed Computing.
The problem of performing t tasks on n asynchronous or undependable processors is a basic problem in parallel and distributed computing. We consider an abstraction of this problem called the Write-All problem: using n processors, write 1's into all locations of an array of size t. The most efficient known deterministic asynchronous algorithms for this problem are due to Anderson and Woll. The first class of algorithms has work complexity O(t · n^ε), for n ≤ t and any ε > 0, and these are the best known for the full range of processors (n ≤ t). To schedule the work of the processors, the algorithms use sets of q permutations on [q] (q ≤ n) that have certain combinatorial properties. Instantiating such an algorithm for a specific ε either requires substantial preprocessing (exponential in 1/ε²) to find the requisite permutations, or imposes a prohibitive constant (exponential in 1/ε³) hidden by the asymptotic analysis. The second class deals with the specific case t = n^u, u ≥ 2, and these algorithms have work complexity O(t log t). They also use sets of permutations with the same combinatorial properties; however, instantiating these algorithms requires preprocessing exponential in n to find the permutations. To alleviate this costly instantiation, Kanellakis and Shvartsman proposed a simple way of computing the permutation schedules. They conjectured that their construction has the desired properties, but they provided no analysis. In this paper we give, for the first time, an analysis of the properties of the set of permutations proposed by Kanellakis and Shvartsman. Our result is hybrid, comprising analytical and empirical parts. The analytical result covers a subset of the possible adversarial patterns of asynchrony. The empirical results provide strong evidence that our analysis covers the worst-case scenario, and we formally state this as a conjecture.
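To make the abstraction concrete, the toy simulation below (a hedged sketch, not the Anderson-Woll or Kanellakis-Shvartsman algorithms; the function name `write_all` and the interleaving model are illustrative assumptions) shows how permutation schedules interact with adversarial asynchrony: each processor visits array cells in the order of its own schedule permutation, the adversary chooses the interleaving of steps, and work counts every visit, including wasted visits to cells another processor already wrote.

```python
# Toy model of the Write-All abstraction: n processors must write 1
# into every cell of an array of size t. Each processor follows its own
# schedule permutation; an adversary-chosen interleaving models
# asynchrony. Work counts all cell visits, wasted or not.

def write_all(t, schedules, interleaving):
    """schedules: one permutation of range(t) per processor.
    interleaving: sequence of processor ids giving the step order."""
    array = [0] * t
    pos = [0] * len(schedules)   # next index into each processor's schedule
    work = 0
    for p in interleaving:
        if pos[p] >= t:          # processor p has exhausted its schedule
            continue
        cell = schedules[p][pos[p]]
        pos[p] += 1
        work += 1
        array[cell] = 1          # rewriting an already-set cell is wasted work
    return array, work

# Two processors with reversed schedules under a round-robin adversary:
# the array is fully written, but each processor still walks its whole
# schedule, so half of the 8 visits are redundant.
arr, w = write_all(4, [[0, 1, 2, 3], [3, 2, 1, 0]], [0, 1] * 4)
```

Good permutation schedules are precisely those that keep this redundant work small against every possible interleaving.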
We use these results to analyze an algorithm for t = n^u (u ≥ 2) tasks that takes advantage of processor slackness and has work O(t log² t), conditioned on our conjecture. This algorithm requires only O(n log n) time to instantiate. Next we study the case of the full range of processors, n ≤ t. We define a family of deterministic asynchronous Write-All algorithms with work O(t · n^ε), contingent upon our conjecture. We show that our method yields a faster construction of O(t · n^ε) Write-All algorithms than the method developed by Anderson and Woll. Finally, we show that our approach yields more efficient Write-All algorithms than those induced by the constructions of Naor and Roth for the same asymptotic work complexity.
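The combinatorial property these constructions target is low contention, which in the Anderson-Woll line of work is commonly measured via left-to-right maxima of composed permutations. The sketch below illustrates that measure under assumed definitions (the helpers `lr_maxima` and `contention` are hypothetical names, and the exact composition convention may differ from the papers'); it is an illustration, not the authors' construction.

```python
# Illustrative contention measure (definitions assumed): the contention
# of a schedule permutation rho against an adversary permutation delta
# is taken here as the number of left-to-right maxima of the composed
# sequence; summing over a set of schedules gives a bound-style quantity
# that low-contention constructions try to keep small.

def lr_maxima(seq):
    """Count left-to-right maxima: elements larger than all before them."""
    count, best = 0, -1
    for x in seq:
        if x > best:
            count, best = count + 1, x
    return count

def contention(schedules, delta):
    """Sum of left-to-right maxima of delta^-1 composed with each schedule."""
    inv = [0] * len(delta)
    for i, d in enumerate(delta):
        inv[d] = i               # invert the adversary permutation
    return sum(lr_maxima([inv[x] for x in rho]) for rho in schedules)

# Against the identity adversary, an increasing schedule is worst
# (every element is a new maximum) and a decreasing one is best.
assert lr_maxima([0, 1, 2, 3]) == 4
assert lr_maxima([3, 2, 1, 0]) == 1
```

The expensive preprocessing mentioned above amounts to searching for permutation sets whose total contention stays low against every adversary permutation, which is what makes cheap explicit constructions like the Kanellakis-Shvartsman one attractive.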