Using compiler directives for accelerating CFD applications on GPUs

Authors:
Haoqiang Jin; Mark Kellogg; Piyush Mehrotra
Affiliations:
NAS Division, NASA Ames Research Center, Moffett Field, CA (all authors)
Venue:
IWOMP'12 Proceedings of the 8th international conference on OpenMP in a Heterogeneous World
Year:
2012

Abstract

As parallel systems trend toward clusters of multi-core nodes enhanced with accelerators, software development for such systems has become a major challenge. Both low-level and high-level programming models have been developed to address the complex hierarchical structures at different hardware levels and to ease the programming effort. However, achieving the desired performance is still not a simple task. In this study, we describe our experience using the accelerator directives developed by the Portland Group to port a computational fluid dynamics (CFD) application benchmark to a general-purpose GPU platform. Our work focuses on the usability of this approach and examines the programming effort and achieved performance on two NVIDIA GPU-based systems. The study shows very promising results in terms of both programmability and performance when compared to other approaches such as the CUDA programming model.
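To illustrate what a directive-based port can look like, the following is a minimal sketch and is not taken from the paper. The stencil, the array names `u` and `u_new`, the size `N`, and the function `smooth_step` are all hypothetical. The pragma is written in the style of the Portland Group accelerator model's region directive; a compiler without accelerator support typically ignores the unknown pragma, so the same source still runs serially on the host.

```c
#include <stddef.h>

#define N 1024  /* hypothetical problem size */

/* Hypothetical field arrays. File-scope arrays with fixed sizes are used
   so that no explicit data-movement clauses are needed in this sketch;
   a real port would usually add them to control host/device transfers. */
static double u[N];
static double u_new[N];

/* One weighted three-point smoothing sweep over the interior points. */
void smooth_step(void)
{
    /* Assumed directive spelling (PGI accelerator model style). The loop
       body is plain C; only the annotation marks it for offloading. */
#pragma acc region
    {
        for (int i = 1; i < N - 1; ++i) {
            u_new[i] = 0.25 * u[i - 1] + 0.5 * u[i] + 0.25 * u[i + 1];
        }
    }

    /* Boundary values are copied through unchanged on the host. */
    u_new[0] = u[0];
    u_new[N - 1] = u[N - 1];
}
```

A caller would fill `u`, invoke `smooth_step()`, and read the result from `u_new`. The point of the sketch is that the loop itself is unchanged from a serial version, whereas a CUDA port of the same sweep would require a separate kernel function, explicit device allocation, and explicit memory copies.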