accULL: an OpenACC implementation with CUDA and OpenCL support

Authors:
Ruymán Reyes; Iván López-Rodríguez; Juan J. Fumero; Francisco de Sande
Affiliations:
Dept. de E.I.O. y Computación, Universidad de La Laguna, La Laguna, Spain;Dept. de E.I.O. y Computación, Universidad de La Laguna, La Laguna, Spain;Dept. de E.I.O. y Computación, Universidad de La Laguna, La Laguna, Spain;Dept. de E.I.O. y Computación, Universidad de La Laguna, La Laguna, Spain
Venue:
Euro-Par'12 Proceedings of the 18th international conference on Parallel Processing
Year:
2012

Abstract

The arrival of hardware accelerators such as GPUs on the HPC scene has made unprecedented performance available to developers. However, even expert developers may not be ready to exploit the new, complex processor hierarchies. We need a way to ease the programming effort for these devices at the programming-language level; otherwise, developers will spend most of their time on device-specific code instead of implementing algorithmic enhancements. The recent advent of the OpenACC standard for heterogeneous computing represents an effort in this direction. This initiative, combined with future releases of the OpenMP standard, will converge into a fully heterogeneous framework that will cope with the programming requirements of future computer architectures. In this work we present accULL, a novel implementation of the OpenACC standard, based on the combination of a source-to-source compiler and a runtime library. To our knowledge, our approach is the first to provide support for both OpenCL and CUDA platforms under this new standard.