High-performance Physics Simulations Using Multi-core CPUs and GPGPUs in a Volunteer Computing Context

Authors:
Kamran Karimi; Neil Dickson; Firas Hamze
Affiliations:
D-Wave Systems Inc., 100-4401 Still Creek Drive, Burnaby, British Columbia, Canada V5C 6G9 (all authors)
Venue:
International Journal of High Performance Computing Applications
Year:
2011

Abstract

This paper presents two conceptually simple methods for parallelizing a Parallel Tempering Monte Carlo simulation in a distributed volunteer computing context, where computers belonging to the general public are used. The first method uses conventional multi-threading. The second method uses CUDA, a graphics card computing system. Parallel Tempering is described, and challenges such as parallel random number generation and mapping of Monte Carlo chains to different threads are explained. While conventional multi-threading on central processing units is well-established, GPGPU programming techniques and technologies are still developing and present several challenges, such as the effective use of a relatively large number of threads. Having multiple chains in Parallel Tempering allows parallelization in a manner that is similar to the serial algorithm. Volunteer computing introduces important constraints to high performance computing, and we show that both versions of the application are able to adapt themselves to the varying and unpredictable computing resources of volunteers' computers, while leaving the machines responsive enough to use. We present experiments to show the scalable performance of these two approaches, and indicate that the efficiency of the methods increases with bigger problem sizes.
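
The idea of mapping Monte Carlo chains to threads, each with its own random number stream, can be illustrated with a minimal sketch. This is not the authors' implementation: the toy model (a ring of Ising spins with random ±1 couplings), the function names, the temperature ladder, and the seeding scheme are all assumptions chosen for illustration. Each chain runs a Metropolis sweep in a worker thread using a private `random.Random` instance, and neighbouring chains then attempt the standard Parallel Tempering swap. Python threads will not speed up this CPU-bound loop; the sketch only shows the structure.

```python
import math
import random
from concurrent.futures import ThreadPoolExecutor


def energy(spins, couplings):
    """Energy of a ring of spins: E = -sum_i J_i * s_i * s_(i+1)."""
    n = len(spins)
    return -sum(couplings[i] * spins[i] * spins[(i + 1) % n] for i in range(n))


def metropolis_sweep(spins, couplings, beta, rng):
    """One Metropolis sweep over a single chain, updated in place."""
    n = len(spins)
    for _ in range(n):
        i = rng.randrange(n)
        # Energy change caused by flipping spin i (only two bonds touch it).
        left = couplings[(i - 1) % n] * spins[(i - 1) % n]
        right = couplings[i] * spins[(i + 1) % n]
        delta = 2.0 * spins[i] * (left + right)
        if delta <= 0.0 or rng.random() < math.exp(-beta * delta):
            spins[i] = -spins[i]


def parallel_tempering(n_spins=32, betas=(0.2, 0.5, 1.0, 2.0),
                       n_rounds=200, seed=1234, n_workers=4):
    """Return the lowest energy seen across all chains (illustrative only)."""
    master = random.Random(seed)  # used for setup and swap decisions
    couplings = [master.choice((-1.0, 1.0)) for _ in range(n_spins)]
    chains = [[master.choice((-1, 1)) for _ in range(n_spins)] for _ in betas]
    # One generator per chain, so no thread shares random state with another.
    # Distinct seeds are an assumption here, not a guarantee of independence.
    rngs = [random.Random(seed + 1 + k) for k in range(len(betas))]
    best = min(energy(c, couplings) for c in chains)

    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        for _ in range(n_rounds):
            # Chains are independent during a sweep, so each gets its own task.
            futures = [
                pool.submit(metropolis_sweep, chains[k], couplings, betas[k], rngs[k])
                for k in range(len(betas))
            ]
            for future in futures:
                future.result()  # wait for the sweep; re-raise any error

            energies = [energy(c, couplings) for c in chains]
            best = min(best, min(energies))

            # Swap step: accept with probability
            # min(1, exp((beta_k - beta_{k+1}) * (E_k - E_{k+1}))).
            for k in range(len(betas) - 1):
                log_ratio = (betas[k] - betas[k + 1]) * (energies[k] - energies[k + 1])
                if log_ratio >= 0.0 or master.random() < math.exp(log_ratio):
                    chains[k], chains[k + 1] = chains[k + 1], chains[k]
                    energies[k], energies[k + 1] = energies[k + 1], energies[k]
    return best
```

Because the sweeps touch disjoint data and the swap step runs only after every sweep has finished, the threaded loop performs the same steps as a serial loop over the chains, which is the property the abstract points to. A GPU version would distribute the work over many more threads than there are chains; that finer-grained mapping is not shown here.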