To exploit the flexibility of OpenMP in parallelizing large-scale multi-physics applications, where different modes of parallelism are needed for efficient computation, OpenMP codes must first scale as well as MPI codes on large core counts. In this research we implemented fine-grained OpenMP parallelism in GenIDLEST, a large CFD code, and investigated its performance from 1 to 256 cores using a variety of performance optimization and measurement tools. Weak and strong scaling studies show that OpenMP performance can be made to match that of MPI on SGI Altix systems for up to 256 cores. Data placement and locality were found to be key to obtaining good scalability with OpenMP. A hybrid MPI/OpenMP implementation on a dual-core system is also shown to give the same performance as standalone MPI or OpenMP. Finally, in irregular multi-physics applications that do not adhere solely to the SPMD (Single Program, Multiple Data) mode of computation, such as tightly coupled fluid-particulate systems, the flexibility of OpenMP can yield a substantial performance advantage over MPI.
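The role of data placement can be illustrated with a minimal sketch of the common "first-touch" idiom. This is an assumption for illustration only: the abstract does not state which placement technique was used in GenIDLEST, and the function names `alloc_first_touch` and `axpy` are hypothetical. On systems where a memory page is placed on the NUMA node of the thread that first writes to it, initializing an array with the same static loop schedule that the compute loops use later tends to keep each thread's data in local memory.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical helper: allocate n doubles and zero them in parallel.
 * Assumption: the operating system uses a first-touch page placement
 * policy, so each page ends up near the thread that first writes it. */
double *alloc_first_touch(size_t n)
{
    assert(n > 0);
    double *a = malloc(n * sizeof *a);
    if (a == NULL)
        return NULL;

    /* Static schedule: thread t touches one contiguous chunk of a[]. */
    #pragma omp parallel for schedule(static)
    for (long i = 0; i < (long)n; ++i)
        a[i] = 0.0;

    return a;
}

/* Hypothetical compute kernel: y = alpha * x + y.
 * It uses the same static schedule as the initialization, so each
 * thread mostly works on the chunk it touched first. */
void axpy(size_t n, double alpha, const double *x, double *y)
{
    assert(x != NULL && y != NULL);

    #pragma omp parallel for schedule(static)
    for (long i = 0; i < (long)n; ++i)
        y[i] += alpha * x[i];
}
```

A caller would allocate `x` and `y` with `alloc_first_touch`, fill them, and then call `axpy`. Without OpenMP enabled at compile time the pragmas are ignored and the code runs serially with the same result; any locality benefit also depends on threads staying bound to their cores, which is outside the scope of this sketch.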