Deterministic parallel random-number generation for dynamic-multithreading platforms

  • Authors:
  • Charles E. Leiserson;Tao B. Schardl;Jim Sukha

  • Affiliations:
  • MIT Computer Science and Artificial Intelligence Laboratory, Cambridge, MA, USA;MIT Computer Science and Artificial Intelligence Laboratory, Cambridge, MA, USA;MIT Computer Science and Artificial Intelligence Laboratory, Cambridge, MA, USA

  • Venue:
  • Proceedings of the 17th ACM SIGPLAN symposium on Principles and Practice of Parallel Programming
  • Year:
  • 2012

Quantified Score

Hi-index 0.00

Visualization

Abstract

Existing concurrency platforms for dynamic multithreading do not provide repeatable parallel random-number generators. This paper proposes that a mechanism called pedigrees be built into the runtime system to enable efficient deterministic parallel random-number generation. Experiments with the open-source MIT Cilk runtime system show that the overhead for maintaining pedigrees is negligible. Specifically, on a suite of 10 benchmarks, the relative overhead of Cilk with pedigrees to the original Cilk has a geometric mean of less than 1%. We persuaded Intel to modify its commercial C/C++ compiler, which provides the Cilk Plus concurrency platform, to include pedigrees, and we built a library implementation of a deterministic parallel random-number generator called DotMix that compresses the pedigree and then "RC6-mixes" the result. The statistical quality of DotMix is comparable to that of the popular Mersenne twister, but somewhat slower than a nondeterministic parallel version of this efficient and high-quality serial random-number generator. The cost of calling DotMix depends on the "spawn depth" of the invocation. For a naive Fibonacci calculation with n=40 that calls DotMix in every node of the computation, this "price of determinism" is a factor of 2.65 in running time, but for more realistic applications with less intense use of random numbers -- such as a maximal-independent-set algorithm, a practical samplesort program, and a Monte Carlo discrete-hedging application from QuantLib -- the observed "price" was less than 5%. Moreover, even if overheads were several times greater, applications using DotMix should be amply fast for debugging purposes, which is a major reason for desiring repeatability.