Low-storage, explicit Runge-Kutta schemes for the compressible Navier-Stokes equations
Applied Numerical Mathematics
MPI versus MPI+OpenMP on IBM SP for the NAS benchmarks
Proceedings of the 2000 ACM/IEEE conference on Supercomputing
IWOMP'11 Proceedings of the 7th international conference on OpenMP in the Petascale era
SC '13 Proceedings of the International Conference on High Performance Computing, Networking, Storage and Analysis
Singe: leveraging warp specialization for high performance on GPUs
Proceedings of the 19th ACM SIGPLAN symposium on Principles and practice of parallel programming
Recent progress and challenges in exploiting graphics processors in computational fluid dynamics
The Journal of Supercomputing
Hi-index | 0.00 |
Hybridization is the process of converting an application with a single level of parallelism to an application with multiple levels of parallelism. Over the past 15 years a majority of the applications that run on High Performance Computing systems have employed MPI for all of the parallelism within the application. In the Peta-Exascale computing regime, effective utilization of the hardware requires multiple levels of parallelism matched to the macro architecture of the system to achieve good performance. A hybridized code base is performance portable when sufficient parallelism is expressed in an architecture agnostic form to achieve good performance on a range of available systems. The hybridized S3D code is performance portable across today's leading many core and GPU accelerated systems. The OpenACC framework allows a unified code base to be deployed for either (Manycore CPU or Manycore CPU+GPU) while permitting architecture specific optimizations to expose new dimensions of parallelism to be utilized.