Graphics processing units (GPUs) are increasingly being used for general-purpose computation. This development is motivated by their theoretical peak performance, which significantly exceeds that of widely available CPUs. In practice, however, it is far from clear how much of this theoretical performance can be realized in actual scientific applications. As discussed here for Monte Carlo simulations of classical spin models of statistical mechanics, the computational power of GPU systems can be harnessed only by explicitly tailoring the algorithms involved to the specific architecture under consideration. A number of examples are discussed, ranging from Metropolis simulations of ferromagnetic Ising models, through continuous Heisenberg and disordered spin-glass systems, to parallel-tempering simulations. Significant speed-ups, by factors of up to 1000, are observed relative to serial CPU code as well as to previous GPU implementations.
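To make concrete what tailoring a Metropolis update to a parallel architecture can involve, the following is a minimal sketch of a checkerboard (two-sublattice) sweep for the 2D ferromagnetic Ising model. The checkerboard decomposition is an assumption made here for illustration; the abstract does not state which decomposition is used. The function name `checkerboard_sweep` and its parameters are hypothetical, the coupling is fixed at J = 1 with periodic boundaries, and the loops run serially on the CPU. On a GPU, all sites of one colour could in principle be updated by concurrent threads, because same-coloured sites are never nearest neighbours.

```python
import math
import random


def checkerboard_sweep(spins, beta, rng):
    """One Metropolis sweep over an L x L Ising lattice (J = 1, periodic).

    Illustrative sketch only: sites are visited one colour at a time.
    Sites of the same colour do not interact directly, so their updates
    are independent and could be carried out in parallel.
    L must be even so that the two-colouring is consistent across the
    periodic boundary.
    """
    L = len(spins)
    if L % 2 != 0:
        raise ValueError("lattice size must be even for a checkerboard update")

    # The energy change of a flip is dE = 2 * s * (sum of 4 neighbours),
    # which can only take the values -8, -4, 0, 4, 8. Tabulate the
    # Metropolis acceptance probability min(1, exp(-beta * dE)) once.
    accept = {
        dE: 1.0 if dE <= 0 else math.exp(-beta * dE)
        for dE in (-8, -4, 0, 4, 8)
    }

    for colour in (0, 1):
        for i in range(L):
            for j in range(L):
                if (i + j) % 2 != colour:
                    continue
                neighbours = (
                    spins[(i + 1) % L][j]
                    + spins[(i - 1) % L][j]
                    + spins[i][(j + 1) % L]
                    + spins[i][(j - 1) % L]
                )
                dE = 2 * spins[i][j] * neighbours
                if rng.random() < accept[dE]:
                    spins[i][j] = -spins[i][j]


if __name__ == "__main__":
    rng = random.Random(12345)
    lattice = [[1] * 8 for _ in range(8)]
    for _ in range(100):
        checkerboard_sweep(lattice, beta=0.3, rng=rng)
    magnetisation = sum(sum(row) for row in lattice) / 64
    print("magnetisation per spin:", magnetisation)
```

The small acceptance table reflects a common design choice for discrete spin models: it avoids evaluating an exponential for every proposed flip. Whether this matches the optimizations behind the reported speed-ups is not something the abstract specifies.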