Efficient low-contention parallel algorithms

Authors:
Phillip B. Gibbons; Yossi Matias; Vijaya Ramachandran
Affiliations:
AT&T Bell Laboratories, 600 Mountain Ave., Murray Hill NJ; AT&T Bell Laboratories, 600 Mountain Ave., Murray Hill NJ; Dept. of Computer Sciences, University of Texas, Austin TX
Venue:
SPAA '94 Proceedings of the sixth annual ACM symposium on Parallel algorithms and architectures
Year:
1994

Citing 25

Storing a Sparse Table with O(1) Worst Case Access Time
Journal of the ACM (JACM)

Sorting in c log n parallel steps
Combinatorica

Parallel merge sort
SIAM Journal on Computing

Optimal and sublogarithmic time randomized parallel sorting algorithms
SIAM Journal on Computing

Faster optimal parallel prefix sums and list ranking
Information and Computation

Hybridsort revisited and parallelized
Information Processing Letters

Parallel iterated bucket sort
Information Processing Letters

Scans as Primitive Parallel Operations
IEEE Transactions on Computers

A complexity theory of efficient parallel algorithms
Theoretical Computer Science - Special issue: Fifteenth international colloquium on automata, languages and programming, Tampere, Finland, July 1988

A bridging model for parallel computation
Communications of the ACM

A new universal class of hash functions and dynamic hashing in real time
Proceedings of the seventeenth international colloquium on Automata, languages and programming

Converting high probability into nearly-constant time—with applications to parallel hashing
STOC '91 Proceedings of the twenty-third annual ACM symposium on Theory of computing

Fast parallel generation of random permutations
Proceedings of the 18th international colloquium on Automata, languages and programming

Parallel algorithms for shared-memory machines
Handbook of theoretical computer science (vol. A)

Towards a theory of nearly constant time parallel algorithms
SFCS '91 Proceedings of the 32nd annual symposium on Foundations of computer science

Fast hashing on a PRAM—designing by expectation
SODA '91 Proceedings of the second annual ACM-SIAM symposium on Discrete algorithms

Ultra-fast expected time parallel algorithms
SODA '91 Proceedings of the second annual ACM-SIAM symposium on Discrete algorithms

An introduction to parallel algorithms

Improved parallel integer sorting without concurrent writing
SODA '92 Proceedings of the third annual ACM-SIAM symposium on Discrete algorithms

Implementation of a portable nested data-parallel language
PPOPP '93 Proceedings of the fourth ACM SIGPLAN symposium on Principles and practice of parallel programming

The QRQW PRAM: accounting for contention in parallel algorithms
SODA '94 Proceedings of the fifth annual ACM-SIAM symposium on Discrete algorithms

The Parallel Evaluation of General Arithmetic Expressions
Journal of the ACM (JACM)

Synthesis of Parallel Algorithms

Simple Fast Parallel Hashing
ICALP '94 Proceedings of the 21st International Colloquium on Automata, Languages and Programming

Graph Theory With Applications

Cited 6

An optical simulation of shared memory
SPAA '94 Proceedings of the sixth annual ACM symposium on Parallel algorithms and architectures

Accounting for memory bank contention and delay in high-bandwidth multiprocessors
Proceedings of the seventh annual ACM symposium on Parallel algorithms and architectures

Asynchronous shared memory search structures
Proceedings of the eighth annual ACM symposium on Parallel algorithms and architectures

Asynchrony versus bulk-synchrony in QRQW PRAM models
PODC '96 Proceedings of the fifteenth annual ACM symposium on Principles of distributed computing

Portable and Efficient Parallel Computing Using the BSP Model
IEEE Transactions on Computers

Lower Bounds for Randomized Exclusive Write PRAMs
Theory of Computing Systems

Abstract

The queue-read, queue-write (QRQW) PRAM model [GMR94] permits concurrent reading and writing, but at a cost proportional to the number of readers/writers to a memory location in a given step. The QRQW model reflects the contention properties of most parallel machines more accurately than either the well-studied CRCW or EREW models: the CRCW model does not adequately penalize algorithms with high contention to shared memory locations, while the EREW model is too strict in its insistence on zero contention at each step. Of primary practical and theoretical interest, then, is the design of fast and efficient QRQW algorithms for problems for which all previous algorithms either suffer from high contention, fail to be fast, or fail to be work-optimal.

This paper describes low-contention, fast, work-optimal QRQW PRAM algorithms for the fundamental problems of finding a random permutation, parallel hashing, load balancing, and sorting. No fast, work-optimal EREW algorithm is known for finding a random permutation or for parallel hashing. For load balancing, we improve upon the EREW result whenever the ratio of the maximum to the average load is not too large. We show that the logarithmic dependence of the QRQW running time on this ratio is inherent by providing a matching lower bound.

We demonstrate the performance advantage of a QRQW random permutation algorithm, compared with the popular EREW algorithm, by implementing and running both algorithms on the MasPar MP-1.

Finally, we extend the work-time framework for the design of parallel algorithms to account for contention, and relate it to the QRQW PRAM model. We use our QRQW load balancing algorithm, as well as the QRQW linear compaction algorithm in [GMR94], to provide automatic tools for processor allocation, an issue that needs to be handled when translating an algorithm from its work-time presentation into the explicit PRAM description.
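The cost rule stated at the start of the abstract can be illustrated with a small sketch. It is not taken from the paper: the function name `qrqw_step_cost` and the representation of a step as two lists of accessed addresses are assumptions made for illustration, and the sketch charges a step only for its maximum contention, ignoring any per-processor local work.

```python
from collections import Counter

def qrqw_step_cost(reads, writes):
    """Illustrative cost of one step under a queue-read, queue-write rule.

    `reads` and `writes` are lists of memory addresses, one entry per
    processor access in this step (a hypothetical representation).
    The step is charged the largest number of processors that read the
    same location or write the same location, and at least 1.
    """
    read_contention = Counter(reads)    # address -> number of readers
    write_contention = Counter(writes)  # address -> number of writers
    max_read = max(read_contention.values(), default=0)
    max_write = max(write_contention.values(), default=0)
    return max(1, max_read, max_write)

# Four processors reading four distinct cells: no contention.
print(qrqw_step_cost([0, 1, 2, 3], []))   # 1

# Four processors reading the same cell: charged 4 under this rule.
print(qrqw_step_cost([5, 5, 5, 5], [7]))  # 4
```

Under this sketch, the first step would also be legal on an EREW PRAM, while the second would be forbidden there and charged unit time on a CRCW PRAM; the queue rule sits between the two by charging in proportion to the contention.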