Characterizing the challenges and evaluating the efficacy of a CUDA-to-OpenCL translator

Authors:
Mark Gardner;Paul Sathre;Wu-Chun Feng;Gabriel Martinez
Affiliations:
-;-;-;-
Venue:
Parallel Computing
Year:
2013

Citing 9
Cited 0

Global Analysis and Transformations in Preprocessed Languages

IEEE Transactions on Software Engineering
DMS®: Program Transformations for Practical Scalable Software Evolution

Proceedings of the 26th International Conference on Software Engineering
MCUDA: An Efficient Implementation of CUDA Kernels for Multi-core CPUs

Languages and Compilers for Parallel Computing
Ocelot: a dynamic optimization framework for bulk-synchronous applications in heterogeneous systems

Proceedings of the 19th international conference on Parallel architectures and compilation techniques
Caracal: dynamic translation of runtime environments for GPUs

Proceedings of the Fourth Workshop on General Purpose Processing on Graphics Processing Units
Accelerating Protein Sequence Search in a Heterogeneous Computing System

IPDPS '11 Proceedings of the 2011 IEEE International Parallel & Distributed Processing Symposium
CU2CL: A CUDA-to-OpenCL Translator for Multi- and Many-Core Architectures

ICPADS '11 Proceedings of the 2011 IEEE 17th International Conference on Parallel and Distributed Systems
Architecture-Aware Mapping and Optimization on a 1600-Core GPU

ICPADS '11 Proceedings of the 2011 IEEE 17th International Conference on Parallel and Distributed Systems
Correctly Treating Synchronizations in Compiling Fine-Grained SPMD-Threaded Programs for CPU

PACT '11 Proceedings of the 2011 International Conference on Parallel Architectures and Compilation Techniques

Quantified Score

Hi-index	0.00

Visualization

Abstract

The proliferation of heterogeneous computing systems has led to increased interest in parallel architectures and their associated programming models. One of the most promising models for heterogeneous computing is the accelerator model, and one of the most cost-effective, high-performance accelerators currently available is the general-purpose, graphics processing unit (GPU). Two similar programming environments have been proposed for GPUs: CUDA and OpenCL. While there are more lines of code already written in CUDA, OpenCL is an open standard that supports a broader. Hence, there is significant interest in automatic translation from CUDA to OpenCL. The contributions of this work are three-fold: (1) an extensive characterization of the subtle challenges of translation, (2) CU2CL (CUDA to OpenCL) - an implementation of a translator, and (3) an evaluation of CU2CL with respect to coverage of CUDA, translation performance, and performance of the translated applications.