Accelerating geoscience and engineering system simulations on graphics hardware
Computers & Geosciences
3.5-D Blocking Optimization for Stencil Computations on Modern CPUs and GPUs
Proceedings of the 2010 ACM/IEEE International Conference for High Performance Computing, Networking, Storage and Analysis
Comparing the performance of stochastic simulation on GPUs and OpenMP
International Journal of Computational Science and Engineering
Comparison of different propagation steps for lattice Boltzmann methods
Computers & Mathematics with Applications
The Journal of Supercomputing
Recent progress and challenges in exploiting graphics processors in computational fluid dynamics
The Journal of Supercomputing
Hi-index | 0.00 |
Lattice Boltzmann Methods (LBM) are used for the computational simulation of Newtonian fluid dynamics. LBM-based simulations are readily parallelizable; they have been implemented on general-purpose processors, field-programmable gate arrays (FPGAs), and graphics processing units (GPUs). Of the three methods, the GPU implementations achieved the highest simulation performance per chip. With memory bandwidth of up to 141 GB/s and a theoretical maximum floating point performance of over 600 GFLOPS, CUDA-ready GPUs from NVIDIA provide an attractive platform for a wide range of scientific simulations, including LBM. This paper improves upon prior single-precision GPU LBM results for the D3Q19 model by increasing GPU multiprocessor occupancy, resulting in an increase in maximum performance by 20%, and by introducing a space-efficient storage method which reduces GPU RAM requirements by 50% at a slight detriment to performance. Both GPU implementations are over 28 times faster than a single-precision quad-core CPU version utilizing OpenMP.