Simple optimizations for an applicative array language for graphics processors

  • Authors: Bradford Larsen
  • Affiliations: Tufts University, Medford, MA, USA
  • Venue: Proceedings of the sixth workshop on Declarative aspects of multicore programming
  • Year: 2011

Abstract

Graphics processors (GPUs) are highly parallel devices that promise high performance, and they are now flexible enough to be used for general-purpose computing. A programming language based on implicitly data-parallel collective array operations can permit high-level, effective programming of GPUs. I describe three optimizations for such a language: automatic use of the GPU's shared memory cache, array fusion, and hoisting of nested parallel constructs. Because of the design of the language to which they are applied, these optimizations are simple to implement, yet they can yield large run-time speedups.
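
To make the array-fusion optimization concrete, here is a minimal sketch in Haskell. It is not the paper's implementation; the names (Exp, Arr, Map, fuse, eval) are hypothetical, and the sketch assumes a tiny first-order expression language with an elementwise map operation. Fusing adjacent maps rewrites two collective traversals into one, so no intermediate array needs to be materialized on the GPU.

    -- Hypothetical mini-language for illustration only; not the paper's AST.
    data Exp
      = Arr [Double]                 -- a literal input array
      | Map (Double -> Double) Exp   -- elementwise map over an array

    -- Fuse adjacent maps: map f (map g xs) ==> map (f . g) xs.
    -- On a GPU backend this avoids materializing the intermediate array.
    fuse :: Exp -> Exp
    fuse (Map f e) =
      case fuse e of
        Map g e' -> Map (f . g) e'   -- collapse the two traversals into one
        e'       -> Map f e'
    fuse e = e

    -- A reference interpreter, used here only to check that fusion
    -- preserves meaning.
    eval :: Exp -> [Double]
    eval (Arr xs)  = xs
    eval (Map f e) = map f (eval e)

    main :: IO ()
    main = do
      let prog = Map (+ 1) (Map (* 2) (Arr [1, 2, 3]))
      print (eval prog)        -- [3.0,5.0,7.0]
      print (eval (fuse prog)) -- same result, single traversal after fusion

The same local rewrite shape extends to other producer/consumer pairs (e.g. a zipWith consuming maps), which is in keeping with the abstract's claim that a small collective-operation language keeps such optimizations simple to implement.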