Efficient low-contention parallel algorithms

  • Authors:
  • Phillip B. Gibbons;Yossi Matias;Vijaya Ramachandran

  • Affiliations:
  • AT&T Bell Laboratories, 600 Mountain Ave., Murray Hill NJ;AT&T Bell Laboratories, 600 Mountain Ave., Murray Hill NJ;Dept. of Computer Sciences, University of Texas, Austin TX

  • Venue:
  • SPAA '94 Proceedings of the sixth annual ACM symposium on Parallel algorithms and architectures
  • Year:
  • 1994

Quantified Score

Hi-index 0.00

Visualization

Abstract

The queue-read, queue-write (QRQW) PRAM model [GMR94] permits concurrent reading and writing, but at a cost proportional to the number of readers/writers to a memory location in a given step. The QRQW model reflects the contention properties of most parallel machines more accurately than either the well-studied CRCW or EREW models: the CRCW model does not adequately penalize algorithms with high contention to shared memory locations, while the EREW model is too strict in its insistence on zero contention at each step. Of primary practical and theoretical interest, then, is the design of fast and efficient QRQW algorithms for problems for which all previous algorithms either suffer from high contention, fail to be fast, or fail to be work-optimal.This paper describes low-contention, fast, work-optimal QRQW PRAM algorithms for the fundamental problems of finding a random permutation, parallel hashing, load balancing, and sorting. There is no known fast, work-optimal EREW algorithm known for finding a random permutation or for parallel hashing. For load balancing, we improve upon the EREW result whenever the ratio of the maximum to the average load is not too large. We show that the logarithmic dependence of the QRQW running time on this ratio is inherent by providing a matching lower bound.We demonstrate the performance advantage of a QRQW random permutation algorithm, compared with the popular EREW algorithm, by implementing and running both algorithms on the MasPar MP-1.Finally, we extend the work-time framework for the design of parallel algorithms to account for contention, and relate it to the QRQW PRAM model. We use our QRQW load balancing algorithm, as well as the QRQW linear compaction algorithm in [GMR94], to provide automatic tools for processor allocation—an issue that needs to be handled when translating an algorithm from its work-time presentation into the explicit PRAM description.