Performance prediction through simulation of a hybrid MPI/OpenMP application
Parallel Computing - OpenMP
Applied Numerical Mathematics - 6th IMACS International symposium on iterative methods in scientific computing
Multi-level parallelism for incompressible flow computations on GPU clusters
Parallel Computing
An efficient parallel iterative method for unstructured grids, developed by the authors for SMP cluster architectures on the GeoFEM platform, is presented. The method is based on a 3-level hybrid parallel programming model: message passing for communication between SMP nodes, OpenMP loop directives for parallelization within each SMP node, and vectorization on each processing element (PE). Simple 3D linear elastic problems with more than 8×10^8 DOF have been solved by 3×3 block ICCG(0) with additive Schwarz domain decomposition and PDJDS/CM-RCM reordering on 128 SMP nodes of a Hitachi SR8000/MPP parallel computer, achieving a performance of 335.2 GFLOPS. The PDJDS/CM-RCM reordering method provides excellent vector and parallel performance within SMP nodes.
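To illustrate why a jagged-diagonal layout suits vector hardware, the following is a minimal sketch of plain jagged diagonal storage (JDS) and its matrix-vector product. It assumes that PDJDS builds on this idea; it is not the authors' PDJDS/CM-RCM implementation, and it omits the multicolor reordering, the 3×3 blocking, and the distribution across PEs and SMP nodes. The function names `to_jds` and `jds_matvec` and the input layout are hypothetical, chosen only for this example.

```python
def to_jds(rows):
    """Convert a sparse matrix to jagged diagonal storage.

    rows: one list per matrix row, each holding (column, value) pairs.
    Returns (perm, jd_ptr, cols, vals), where perm lists the original row
    indices in descending order of nonzero count.
    """
    n = len(rows)
    # Sort rows so that the longest rows come first (stable sort).
    perm = sorted(range(n), key=lambda i: -len(rows[i]))
    max_nnz = len(rows[perm[0]]) if n else 0
    jd_ptr = [0]
    cols, vals = [], []
    for k in range(max_nnz):
        # The k-th jagged diagonal holds the k-th nonzero of every row
        # that has at least k + 1 nonzeros.
        for i in perm:
            if len(rows[i]) <= k:
                break  # rows are sorted, so all later rows are shorter too
            c, v = rows[i][k]
            cols.append(c)
            vals.append(v)
        jd_ptr.append(len(cols))
    return perm, jd_ptr, cols, vals


def jds_matvec(perm, jd_ptr, cols, vals, x):
    """Compute y = A x for a matrix stored by to_jds."""
    y_perm = [0.0] * len(perm)
    for k in range(len(jd_ptr) - 1):
        start = jd_ptr[k]
        # The inner loop runs over rows, so it is long and its iterations
        # are independent: the shape a vector unit can exploit.
        for j in range(jd_ptr[k + 1] - start):
            y_perm[j] += vals[start + j] * x[cols[start + j]]
    # Undo the row permutation.
    y = [0.0] * len(perm)
    for j, i in enumerate(perm):
        y[i] = y_perm[j]
    return y


# Small tridiagonal example (values chosen only for illustration).
A = [
    [(0, 4.0), (1, -1.0)],
    [(0, -1.0), (1, 4.0), (2, -1.0)],
    [(1, -1.0), (2, 4.0)],
]
jds = to_jds(A)
y = jds_matvec(*jds, [1.0, 2.0, 3.0])
```

In a row-oriented format the innermost loop length equals the number of nonzeros per row, which is short for finite-element matrices; in this layout it equals the number of rows that reach a given jagged diagonal, which is typically much longer.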