Multivariate Gaussian Random Number Generation Targeting Reconfigurable Hardware

  • Authors:
  • David B. Thomas;Wayne Luk

  • Affiliations:
  • Imperial College London;Imperial College London

  • Venue:
  • ACM Transactions on Reconfigurable Technology and Systems (TRETS)
  • Year:
  • 2008

Quantified Score

Hi-index 0.00

Visualization

Abstract

The multivariate Gaussian distribution is often used to model correlations between stochastic time-series, and can be used to explore the effect of these correlations across N time-series in Monte-Carlo simulations. However, generating random correlated vectors is an O(N2) process, and quickly becomes a computational bottleneck in software simulations. This article presents an efficient method for generating vectors in parallel hardware, using N parallel pipelined components to generate a new vector every N cycles. This method maps well to the embedded block RAMs and multipliers in contemporary FPGAs, particularly as extensive testing shows that the limited bit-width arithmetic does not reduce the statistical quality of the generated vectors. An implementation of the architecture in the Virtex-4 architecture achieves a 500MHz clock-rate, and can support vector lengths up to 512 in the largest devices. The combination of a high clock-rate and parallelism provides a significant performance advantage over conventional processors, with an xc4vsx55 device at 500MHz providing a 200 times speedup over an Opteron 2.6GHz using an AMD optimised BLAS package. In a case study in Delta-Gamma Value-at Risk, an RC2000 accelerator card using an xc4vsx55 at 400MHz is 26 times faster than a quad Opteron 2.6GHz SMP.