We describe our experience using NVIDIA's CUDA (Compute Unified Device Architecture) C programming environment to implement a two-dimensional, second-order MUSCL-Hancock ideal magnetohydrodynamics (MHD) solver on a GTX 480 Graphics Processing Unit (GPU). Taking a simple approach in which the MHD variables are stored exclusively in the global memory of the GTX 480 and accessed in a cache-friendly manner (without further optimizing memory access by, for example, staging data in the GPU's faster shared memory), we achieved a maximum speedup of ~126 on a 1024^2 grid relative to the sequential C code running on a single Intel Nehalem (2.8 GHz) core. This speedup is consistent with simple estimates based on the known floating-point performance, memory throughput, and parallel processing capacity of the GTX 480.
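A back-of-envelope check of the kind the abstract alludes to can be sketched as follows. This is not the authors' estimate: the GPU figure is the published GTX 480 spec-sheet peak, and the per-core CPU figure is an assumption (4 single-precision FLOPs per cycle, i.e. one 4-wide SSE operation per cycle), since the abstract does not state how the sequential C code was compiled or vectorized.

```python
# Hedged roofline-style plausibility check for a ~126x GPU-vs-one-core speedup.
# Spec-sheet peak for the GTX 480: 480 CUDA cores x 1.401 GHz x 2 FLOPs (FMA).
gpu_peak_gflops = 1345.0

# ASSUMED per-core peak for a 2.8 GHz Nehalem core: 4 single-precision
# FLOPs/cycle (one 4-wide SSE op per cycle). The real sustained rate of the
# paper's sequential code is unknown, so this is only an illustrative figure.
cpu_peak_gflops = 2.8 * 4  # 11.2 GFLOPS

# If both codes were compute-bound at these rates, the attainable speedup
# would be roughly the ratio of peak throughputs.
compute_bound_speedup = gpu_peak_gflops / cpu_peak_gflops
print(f"compute-bound speedup estimate: ~{compute_bound_speedup:.0f}x")
```

Under these assumed numbers the ratio comes out near 120x, the same order as the measured ~126; a matching memory-side estimate would compare the GTX 480's peak global-memory bandwidth against the bandwidth one CPU core actually sustains.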