Hybrid programming model for implicit PDE simulations on multicore architectures

Authors:
Dinesh Kaushik;David Keyes;Satish Balay;Barry Smith
Affiliations:
King Abdullah University of Science and Technology, Saudi Arabia;King Abdullah University of Science and Technology, Saudi Arabia;Argonne National Laboratory, Argonne, IL;Argonne National Laboratory, Argonne, IL
Venue:
IWOMP'11 Proceedings of the 7th international conference on OpenMP in the Petascale era
Year:
2011

Abstract

Programming clusters built from modern multicore processors is rapidly growing more complex, and GPUs add further demand for fine-grained parallelism. This paper analyzes the performance of the hybrid (MPI+OpenMP) programming model in the context of an implicit unstructured-mesh CFD code. At the implementation level, it studies the effects of cache locality, update management, work division, and synchronization frequency. The hybrid model also presents interesting algorithmic opportunities: the linear system solver converges faster than in the pure MPI case because the parallel preconditioner stays stronger when the hybrid model is used. This implies significant savings in the cost of communication and synchronization (explicit and implicit). Even though OpenMP-based parallelism is easier to implement (within a subdomain assigned to one MPI process, for simplicity), getting good performance requires attention to data partitioning issues similar to those in the message-passing case.
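
The claim that the preconditioner "stays stronger" can be illustrated with a small sketch. It is not the paper's CFD code or solver. It assumes a 1D Laplacian as a stand-in problem, a block-Jacobi preconditioner inside a plain Richardson iteration, and one diagonal block per subdomain (that is, per MPI process). Under those assumptions, a hybrid run that places one process per node instead of one per core corresponds to fewer, larger blocks. The function names and parameters below are hypothetical and chosen only for illustration.

```python
import math


def solve_laplacian_block(rhs):
    """Solve T y = rhs for T = tridiag(-1, 2, -1) with the Thomas algorithm."""
    m = len(rhs)
    c = [0.0] * m  # modified super-diagonal
    d = [0.0] * m  # modified right-hand side
    c[0] = -0.5
    d[0] = rhs[0] / 2.0
    for i in range(1, m):
        denom = 2.0 + c[i - 1]
        c[i] = -1.0 / denom
        d[i] = (rhs[i] + d[i - 1]) / denom
    y = [0.0] * m
    y[-1] = d[-1]
    for i in range(m - 2, -1, -1):
        y[i] = d[i] - c[i] * y[i + 1]
    return y


def laplacian_residual(x, b):
    """Return b - A x for the global 1D Laplacian A = tridiag(-1, 2, -1)."""
    n = len(x)
    r = []
    for i in range(n):
        left = x[i - 1] if i > 0 else 0.0
        right = x[i + 1] if i < n - 1 else 0.0
        r.append(b[i] - (2.0 * x[i] - left - right))
    return r


def norm(v):
    return math.sqrt(sum(value * value for value in v))


def block_jacobi_iterations(n, num_blocks, tol=1e-8, max_sweeps=100000):
    """Count block-Jacobi sweeps needed to reduce the residual by a factor tol.

    Assumption: each block stands for the subdomain owned by one MPI process;
    coupling between blocks is ignored by the preconditioner.
    """
    if n % num_blocks != 0:
        raise ValueError("n must be divisible by num_blocks")
    size = n // num_blocks
    b = [1.0] * n
    x = [0.0] * n
    b_norm = norm(b)
    for sweep in range(max_sweeps + 1):
        r = laplacian_residual(x, b)
        if norm(r) <= tol * b_norm:
            return sweep
        # Every block is solved against the same residual, as in block Jacobi.
        for start in range(0, n, size):
            correction = solve_laplacian_block(r[start:start + size])
            for k, value in enumerate(correction):
                x[start + k] += value
    raise RuntimeError("block Jacobi did not converge")


if __name__ == "__main__":
    # Hypothetical setup: 8 subdomains (one per core) vs 2 (one per node).
    print("8 blocks:", block_jacobi_iterations(32, 8), "sweeps")
    print("2 blocks:", block_jacobi_iterations(32, 2), "sweeps")
```

In this toy setting the two-block run needs fewer sweeps than the eight-block run, because less coupling is discarded at subdomain boundaries. Fewer sweeps also mean fewer global residual norms, which hints at the communication and synchronization savings the abstract mentions. A production code would use a Krylov method and a stronger subdomain solver, so the numbers here are indicative only.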