Performance characterization of the NAS Parallel Benchmarks in OpenCL

Authors:
Sangmin Seo;Gangwon Jo;Jaejin Lee
Affiliations:
Center for Manycore Programming, School of Computer Science and Engineering, Seoul National University, 151-744, Korea;Center for Manycore Programming, School of Computer Science and Engineering, Seoul National University, 151-744, Korea;Center for Manycore Programming, School of Computer Science and Engineering, Seoul National University, 151-744, Korea
Venue:
IISWC '11 Proceedings of the 2011 IEEE International Symposium on Workload Characterization
Year:
2011

Citing 0
Cited 9

OpenCL as a unified programming model for heterogeneous CPU/GPU clusters
Proceedings of the 17th ACM SIGPLAN symposium on Principles and Practice of Parallel Programming

SnuCL: an OpenCL framework for heterogeneous CPU/GPU clusters
Proceedings of the 26th ACM international conference on Supercomputing

Using compiler directives for accelerating CFD applications on GPUs
IWOMP'12 Proceedings of the 8th international conference on OpenMP in a Heterogeneous World

SnuCL and an MPI+OpenCL implementation of HPL on heterogeneous CPU/GPU clusters
Proceedings of the ATIP/A*CRC Workshop on Accelerator Technologies for High-Performance Computing: Does Asia Lead the Way?

Scaling analytics applications with OpenCL for loosely coupled heterogeneous clusters
Proceedings of the ACM International Conference on Computing Frontiers

Automatic OpenCL work-group size selection for multicore CPUs
PACT '13 Proceedings of the 22nd international conference on Parallel architectures and compilation techniques

Exploiting heterogeneous parallelism with the Heterogeneous Programming Library
Journal of Parallel and Distributed Computing

Improving application behavior on heterogeneous manycore systems through kernel mapping
Parallel Computing

OpenCL framework for ARM processors with NEON support
Proceedings of the 2014 Workshop on Programming models for SIMD/Vector processing

Abstract

Heterogeneous parallel computing platforms, which combine different kinds of processors (e.g., CPUs, GPUs, FPGAs, and DSPs), are widening their user base in all computing domains. With this trend, parallel programming models need to achieve portability across different processors as well as high performance with reasonable programming effort. OpenCL (Open Computing Language) is an open standard and an emerging parallel programming model for writing parallel applications for such heterogeneous platforms. In this paper, we characterize the performance of an OpenCL implementation of the NAS Parallel Benchmark suite (NPB) on a heterogeneous parallel platform that consists of general-purpose CPUs and a GPU. We believe that understanding the performance characteristics of conventional workloads, such as the NPB, under an emerging programming model (i.e., OpenCL) is important for developers and researchers who are considering adopting that model. We also compare the performance of the OpenCL version of the NPB with that of the OpenMP version. We describe the process of implementing the NPB in OpenCL and the optimizations applied in our implementation. Experimental results and analysis show that the OpenCL version has different characteristics from the OpenMP version on multicore CPUs and that its performance characteristics differ across OpenCL compute devices. The results also indicate that, although OpenCL provides source-code portability, an application needs to be rewritten or re-optimized to perform well on a different compute device.