The goal of this paper is to construct efficient parallel solvers for 2D hyperbolic systems of conservation laws with source terms and nonconservative products. The method of lines is applied: at every intercell, a projected Riemann problem along the normal direction is considered and discretized by means of well-balanced Roe methods. The resulting 2D numerical scheme is explicit and first-order accurate. In [M.J. Castro, J.A. Garcia, J.M. Gonzalez, C. Pares, A parallel 2D Finite Volume scheme for solving systems of balance laws with nonconservative products: Application to shallow flows, Comput. Methods Appl. Mech. Engrg. 196 (2006) 2788-2815], a domain decomposition method was used to parallelize this scheme, which was implemented on a PC cluster by means of MPI. In this paper, to further optimize the computations, a new SIMD-type parallelization is performed within each MPI thread by means of SSE ("Streaming SIMD Extensions") instructions, which are available on common processors. More specifically, since the most costly part of the calculations performed at each processor consists of a very large number of small matrix and vector computations, we use the Intel Integrated Performance Primitives small matrix library. To make this library, which is implemented using assembler and SSE instructions, easier to use, we have developed an efficient C++ wrapper for it. Some numerical tests were carried out to validate the performance of the C++ small matrix wrapper. The specific application of the scheme to one-layer shallow-water systems has been implemented on a PC cluster. The correct behavior of the one-layer model is assessed using laboratory data.
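To illustrate what wrapping SSE-based small matrix kernels behind C++ operators can look like, the following is a minimal sketch. It does not reproduce the wrapper described above and does not call the Intel Integrated Performance Primitives library. The types `Vec4` and `Mat4`, the fixed 4x4 size, the column-major storage, and the direct use of SSE intrinsics are all assumptions made for illustration. A scalar fallback is included so that the sketch also compiles on processors without SSE.

```cpp
#if defined(__SSE__) || defined(_M_X64)
#include <xmmintrin.h>  // SSE intrinsics (_mm_loadu_ps, _mm_mul_ps, ...)
#define SMALLMAT_USE_SSE 1
#endif

// Hypothetical 4-component single-precision vector (illustrative only).
struct Vec4 {
    float v[4];
};

// Hypothetical 4x4 single-precision matrix stored column by column,
// so that each column fills exactly one 128-bit SSE register.
struct Mat4 {
    float col[4][4];  // col[j][i] is the entry in row i, column j
};

// y = A * x, computed as the sum over j of x[j] * (column j of A).
// The operator hides the intrinsics from the calling code.
inline Vec4 operator*(const Mat4& A, const Vec4& x) {
    Vec4 y;
#ifdef SMALLMAT_USE_SSE
    // Broadcast x[0] to all four lanes and scale the first column.
    __m128 acc = _mm_mul_ps(_mm_loadu_ps(A.col[0]), _mm_set1_ps(x.v[0]));
    for (int j = 1; j < 4; ++j) {
        // Accumulate x[j] * column j, four rows at a time.
        acc = _mm_add_ps(acc, _mm_mul_ps(_mm_loadu_ps(A.col[j]),
                                         _mm_set1_ps(x.v[j])));
    }
    _mm_storeu_ps(y.v, acc);
#else
    // Scalar fallback with the same result, used when SSE is unavailable.
    for (int i = 0; i < 4; ++i) {
        y.v[i] = 0.0f;
        for (int j = 0; j < 4; ++j) {
            y.v[i] += A.col[j][i] * x.v[j];
        }
    }
#endif
    return y;
}
```

With such an operator, solver code can be written as `Vec4 y = A * x;` while the four rows of the product are computed in a single SIMD register. The size 4 is chosen here only because it matches the width of an SSE register in single precision; a system with three unknowns per cell would need padding under this assumption.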