Polyhedral parallel code generation for CUDA

  • Authors:
  • Sven Verdoolaege; Juan Carlos Juega; Albert Cohen; José Ignacio Gómez; Christian Tenllado; Francky Catthoor

  • Affiliations:
  • INRIA and École Normale Supérieure; Universidad Complutense de Madrid; INRIA and École Normale Supérieure; Universidad Complutense de Madrid; Universidad Complutense de Madrid; IMEC

  • Venue:
  • ACM Transactions on Architecture and Code Optimization (TACO) - Special Issue on High-Performance Embedded Architectures and Compilers
  • Year:
  • 2013

Abstract

This article addresses the compilation of a sequential program for parallel execution on a modern GPU. To this end, we present a novel source-to-source compiler called PPCG. PPCG stands out for its ability to accelerate computations from any static control loop nest, generating multiple CUDA kernels when necessary. We introduce a multilevel tiling strategy and a code generation scheme for the parallelization and locality optimization of imperfectly nested loops, managing memory and exposing concurrency according to the constraints of modern GPUs. We evaluate our algorithms and tool on the entire PolyBench suite.
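
To make the idea of multilevel tiling of a static control loop nest more concrete, the following is a minimal hand-written sketch in plain C. It is not PPCG output and does not reproduce the article's algorithm; it only illustrates the general shape of a two-level tiled loop nest. The kernel (a dense matrix product), the function names matmul_ref and matmul_tiled, and the tile sizes T1 and T2 are all assumptions chosen for illustration. In a GPU mapping, the outer tile loops would plausibly be distributed over thread blocks and the inner tile loops over threads, with the k tiles being candidates for staging in shared memory.

```c
#include <assert.h>

/* Hypothetical tile sizes, chosen only for illustration. */
#define T1 16 /* outer tile: might correspond to a CUDA thread block */
#define T2 4  /* inner tile: might correspond to the work of one thread */

static int min_int(int a, int b) { return a < b ? a : b; }

/* Reference: the original (untiled) static control loop nest.
 * Matrices are n x n, stored row-major in flat arrays. */
void matmul_ref(int n, const float *A, const float *B, float *C) {
    assert(n > 0);
    for (int i = 0; i < n; i++)
        for (int j = 0; j < n; j++) {
            C[i * n + j] = 0.0f;
            for (int k = 0; k < n; k++)
                C[i * n + j] += A[i * n + k] * B[k * n + j];
        }
}

/* Two-level tiled version computing the same result.
 * The min_int bounds handle sizes that are not multiples of the tiles. */
void matmul_tiled(int n, const float *A, const float *B, float *C) {
    assert(n > 0);
    for (int i = 0; i < n * n; i++)
        C[i] = 0.0f;

    /* Level 1: iterate over T1 x T1 tiles of the iteration space. */
    for (int ii = 0; ii < n; ii += T1) {
        int i_end = min_int(ii + T1, n);
        for (int jj = 0; jj < n; jj += T1) {
            int j_end = min_int(jj + T1, n);
            /* Sequential loop over tiles of the reduction dimension. */
            for (int kk = 0; kk < n; kk += T1) {
                int k_end = min_int(kk + T1, n);
                /* Level 2: iterate over T2 x T2 sub-tiles of each tile. */
                for (int i2 = ii; i2 < i_end; i2 += T2)
                    for (int j2 = jj; j2 < j_end; j2 += T2)
                        /* Point loops inside one sub-tile. */
                        for (int i = i2; i < min_int(i2 + T2, i_end); i++)
                            for (int j = j2; j < min_int(j2 + T2, j_end); j++)
                                for (int k = kk; k < k_end; k++)
                                    C[i * n + j] += A[i * n + k] * B[k * n + j];
            }
        }
    }
}
```

Both functions visit each (i, j, k) point exactly once, and for a fixed (i, j) the tiled version still accumulates over k in increasing order, so the two produce the same values. Only the traversal order across different (i, j) points changes, which is what allows tiles to be treated as independent units of parallel work.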