OpenCL framework for ARM processors with NEON support

Authors:
Gangwon Jo;Won Jong Jeon;Wookeun Jung;Gordon Taft;Jaejin Lee
Affiliations:
Seoul National University, Seoul, South Korea;Samsung Research America - Silicon Valley, San Jose, CA, USA;Seoul National University, Seoul, South Korea;Samsung Research America - Silicon Valley, San Jose, CA, USA;Seoul National University, Seoul, South Korea
Venue:
Proceedings of the 2014 Workshop on Programming models for SIMD/Vector processing
Year:
2014

Abstract

State-of-the-art ARM processors provide multiple cores and SIMD instructions. OpenCL is a promising programming model for exploiting this parallelism because of its SPMD execution model and built-in vector support. Moreover, it offers portability between multicore ARM processors and accelerators in embedded systems. In this paper, we present the design and implementation of an efficient OpenCL framework for multicore ARM processors. Computational tasks in a program are implemented as OpenCL kernels, which our framework runs on all CPU cores in parallel. Vector operations and built-in functions in OpenCL kernels are optimized using the NEON SIMD instruction set. We evaluate the framework using 37 benchmark applications. The results show that our approach is effective and promising.
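
To make the idea concrete, the following is a minimal sketch in C of how a four-wide OpenCL vector addition might map onto a single NEON instruction, and how a range of SPMD work-items could be executed by one CPU core. It is not the framework's actual implementation, which the abstract does not describe. The names `float4_t`, `float4_add`, `vec_add_work_item`, and `run_work_item_range` are hypothetical and chosen only for illustration. The NEON intrinsics `vld1q_f32`, `vaddq_f32`, and `vst1q_f32` are used when the compiler targets NEON; otherwise a scalar fallback keeps the sketch portable.

```c
#include <assert.h>
#include <stddef.h>

#if defined(__ARM_NEON) || defined(__ARM_NEON__)
#include <arm_neon.h>
#endif

/* Hypothetical stand-in for an OpenCL float4 value. */
typedef struct { float s[4]; } float4_t;

/* float4 addition: one NEON add when available, a scalar loop otherwise. */
static float4_t float4_add(float4_t a, float4_t b) {
    float4_t r;
#if defined(__ARM_NEON) || defined(__ARM_NEON__)
    vst1q_f32(r.s, vaddq_f32(vld1q_f32(a.s), vld1q_f32(b.s)));
#else
    for (int i = 0; i < 4; ++i) {
        r.s[i] = a.s[i] + b.s[i];
    }
#endif
    return r;
}

/* Body of an illustrative kernel, c[gid] = a[gid] + b[gid],
 * written as a plain function of the global work-item id. */
static void vec_add_work_item(size_t gid, const float4_t *a,
                              const float4_t *b, float4_t *c) {
    c[gid] = float4_add(a[gid], b[gid]);
}

/* Execute work-items [begin, end) one after another. A runtime could
 * give each CPU core a disjoint range like this (an assumption made
 * here for illustration, not a description of the paper's scheduler). */
static void run_work_item_range(size_t begin, size_t end,
                                const float4_t *a, const float4_t *b,
                                float4_t *c) {
    assert(begin <= end);
    for (size_t gid = begin; gid < end; ++gid) {
        vec_add_work_item(gid, a, b, c);
    }
}
```

Calling `run_work_item_range(0, n, a, b, c)` from a single thread processes all `n` work-items. Splitting `[0, n)` into disjoint ranges and calling the function once per thread would spread the same kernel across cores, since each work-item here writes only its own output element.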