The future of high-performance computing is likely to rely on the ability to efficiently exploit huge amounts of parallelism. One way of taking advantage of this parallelism is to formulate problems as "embarrassingly parallel" Monte-Carlo simulations, which allow applications to achieve a linear speedup over multiple computational nodes without requiring a super-linear increase in inter-node communication. However, such applications rely on a cheap supply of high-quality random numbers, particularly for the three main maximum-entropy distributions: uniform, used as a general source of randomness; Gaussian, for discrete-time simulations; and exponential, for discrete-event simulations. In this paper we look at four different types of platform: conventional multi-core CPUs (Intel Core2), GPUs (NVidia GTX 200), FPGAs (Xilinx Virtex-5), and Massively Parallel Processor Arrays (Ambric AM2000). For each platform we determine the most appropriate algorithm for generating each type of number, then calculate the peak generation rate and estimated power efficiency for each device.
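To make the three distributions concrete, the following is a minimal Python sketch of how exponential and Gaussian variates can be derived from a uniform source. It uses two textbook transforms, inversion for the exponential and Box-Muller for the Gaussian. These are illustrative choices only and are not necessarily the algorithms selected for any of the platforms compared in the paper. The function names, the `rate` parameter, and the use of Python's standard `random` module as the uniform source are all assumptions made for this example.

```python
import math
import random


def exponential_from_uniform(u, rate=1.0):
    """Inversion method: for u in [0, 1), -ln(1 - u) / rate is Exp(rate)."""
    return -math.log1p(-u) / rate


def gaussian_pair_from_uniform(u1, u2):
    """Box-Muller transform: maps u1 in (0, 1] and u2 in [0, 1)
    to two independent standard Gaussian variates."""
    radius = math.sqrt(-2.0 * math.log(u1))
    angle = 2.0 * math.pi * u2
    return radius * math.cos(angle), radius * math.sin(angle)


def sample_all(n, seed=0):
    """Draw n uniform, n exponential, and n Gaussian samples
    from a single seeded uniform generator (hypothetical helper)."""
    rng = random.Random(seed)
    uniforms = [rng.random() for _ in range(n)]
    exponentials = [exponential_from_uniform(rng.random()) for _ in range(n)]
    gaussians = []
    while len(gaussians) < n:
        u1 = 1.0 - rng.random()  # shift [0, 1) to (0, 1] so log(u1) is finite
        u2 = rng.random()
        gaussians.extend(gaussian_pair_from_uniform(u1, u2))
    return uniforms, exponentials, gaussians[:n]


if __name__ == "__main__":
    u, e, g = sample_all(100_000, seed=42)
    print("uniform mean     ", sum(u) / len(u))
    print("exponential mean ", sum(e) / len(e))
    print("gaussian mean    ", sum(g) / len(g))
```

Both transforms rely on logarithms and trigonometric functions, which are comparatively cheap on a CPU but can be costly in hardware such as an FPGA. This is one reason the most appropriate algorithm may differ from platform to platform.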