This paper addresses the acceleration of the numerical solution of shallow-water systems in 2D domains using modern graphics processing units (GPUs). A first-order, well-balanced finite volume scheme for 2D shallow-water systems is considered. The potential data parallelism of this method is identified, and the scheme is efficiently implemented on GPUs for one-layer shallow-water systems. Numerical experiments performed on several GPUs show the high efficiency of the GPU solver in comparison with a highly optimized CPU implementation.
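To illustrate the kind of data parallelism a first-order finite volume scheme exposes, the following minimal sketch advances the 1D one-layer shallow-water equations over a flat bottom with a Rusanov (local Lax-Friedrichs) flux. It is an assumption-laden stand-in, not the well-balanced 2D scheme of the paper: the function names (`physical_flux`, `edge_flux`, `step`), the periodic boundaries, and the choice of flux are all hypothetical. The point is only that each edge flux depends on its two neighbouring cells and each cell update depends on its two edge fluxes, so both phases could map to one GPU thread per edge or per cell.

```python
# Illustrative sketch only: first-order Rusanov finite volume step for the
# 1D one-layer shallow-water equations over a flat bottom.
# NOT the paper's well-balanced scheme; it only shows the two data-parallel
# phases (per-edge flux computation, per-cell update).

G = 9.81  # gravitational acceleration (m/s^2)


def physical_flux(h, q):
    """Physical flux for the state (h, q), where q = h * u is the discharge."""
    u = q / h
    return (q, q * u + 0.5 * G * h * h)


def edge_flux(left, right):
    """Rusanov numerical flux across one edge; reads only its two neighbours."""
    (hl, ql), (hr, qr) = left, right
    fl, fr = physical_flux(hl, ql), physical_flux(hr, qr)
    # Largest local wave speed |u| + sqrt(g h) on either side of the edge.
    speed = max(abs(ql / hl) + (G * hl) ** 0.5,
                abs(qr / hr) + (G * hr) ** 0.5)
    return tuple(
        0.5 * (fl[k] + fr[k]) - 0.5 * speed * (right[k] - left[k])
        for k in range(2)
    )


def step(cells, dx, dt):
    """Advance every cell by one time step (periodic boundaries assumed)."""
    n = len(cells)
    # Phase 1: one independent task per edge.
    # Edge i lies between cell i-1 and cell i (index -1 wraps around).
    fluxes = [edge_flux(cells[i - 1], cells[i]) for i in range(n)]
    # Phase 2: one independent task per cell, using its left and right fluxes.
    return [
        tuple(
            cells[i][k] - dt / dx * (fluxes[(i + 1) % n][k] - fluxes[i][k])
            for k in range(2)
        )
        for i in range(n)
    ]
```

For example, `step([(2.0, 0.0)] * 8 + [(1.0, 0.0)] * 8, dx=1.0, dt=0.05)` advances a small dam-break profile by one step. Neither list comprehension has dependencies between iterations, which is the property a GPU implementation would exploit.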