Optimising purely functional GPU programs

  • Authors:
  • Trevor L. McDonell;Manuel M.T. Chakravarty;Gabriele Keller;Ben Lippmeier

  • Affiliations:
  • University of New South Wales, Sydney, Australia;University of New South Wales, Sydney, Australia;University of New South Wales, Sydney, Australia;University of New South Wales, Sydney, Australia

  • Venue:
  • Proceedings of the 18th ACM SIGPLAN international conference on Functional programming
  • Year:
  • 2013

Quantified Score

Hi-index 0.00

Visualization

Abstract

Purely functional, embedded array programs are a good match for SIMD hardware, such as GPUs. However, the naive compilation of such programs quickly leads to both code explosion and an excessive use of intermediate data structures. The resulting slow-down is not acceptable on target hardware that is usually chosen to achieve high performance. In this paper, we discuss two optimisation techniques, sharing recovery and array fusion, that tackle code explosion and eliminate superfluous intermediate structures. Both techniques are well known from other contexts, but they present unique challenges for an embedded language compiled for execution on a GPU. We present novel methods for implementing sharing recovery and array fusion, and demonstrate their effectiveness on a set of benchmarks.