Adaptive input-aware compilation for graphics engines

  • Authors:
  • Mehrzad Samadi; Amir Hormati; Mojtaba Mehrara; Janghaeng Lee; Scott Mahlke

  • Affiliations:
  • University of Michigan, Ann Arbor, MI, USA; Microsoft Research, Redmond, WA, USA; NVIDIA Research, Santa Clara, CA, USA; University of Michigan, Ann Arbor, MI, USA; University of Michigan, Ann Arbor, MI, USA

  • Venue:
  • Proceedings of the 33rd ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI '12)
  • Year:
  • 2012

Abstract

While graphics processing units (GPUs) provide low-cost and efficient platforms for accelerating high-performance computations, the tedious performance tuning required to optimize applications is an obstacle to wider adoption of GPUs. In addition to the programmability challenges posed by the GPU's complex memory hierarchy and parallelism model, a well-known application design problem is target portability across different GPUs. However, even for a single GPU target, changing a program's input characteristics can make an already-optimized implementation perform poorly. In this work, we propose Adaptic, an adaptive input-aware compilation system that tackles this important, yet overlooked, input portability problem. Using this system, programmers develop their applications in a high-level streaming language and let Adaptic undertake the difficult task of input-portable optimization and code generation. Several input-aware optimizations are introduced to make efficient use of the memory hierarchy and to customize thread composition. At runtime, a properly optimized version of the application is executed based on the actual program input. We perform a head-to-head comparison between Adaptic-generated and hand-optimized CUDA programs. The results show that Adaptic generates code that performs on par with its hand-optimized counterparts over certain input ranges and outperforms them when the input falls outside the hand-optimized programs' "comfort zone". Furthermore, we show that these input-aware results are sustained across different GPU targets, making it possible to write and optimize applications once and run them anywhere.
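
To illustrate the runtime half of this idea, the sketch below shows in plain CUDA how a host-side dispatcher might pick between two pre-compiled variants of the same reduction based on the actual input size. The kernel names, the 4096-element cut-over point, and the variants themselves are hypothetical and are not taken from Adaptic; an input-aware compiler would generate such variants and derive the cut-over points per target GPU rather than hard-coding them.

// Illustrative sketch only (not Adaptic's implementation): two hypothetical
// variants of a sum reduction, one tuned for small inputs (a single block,
// low launch overhead) and one for large inputs (many blocks, enough
// parallelism to saturate the GPU), with input-size-based dispatch at runtime.
#include <cuda_runtime.h>

__global__ void sum_small(const float* in, float* out, int n) {
    // Single-block variant: each thread strides over the input, then a
    // shared-memory tree reduction combines the per-thread partial sums.
    __shared__ float buf[256];
    float acc = 0.0f;
    for (int i = threadIdx.x; i < n; i += blockDim.x) acc += in[i];
    buf[threadIdx.x] = acc;
    __syncthreads();
    for (int s = blockDim.x / 2; s > 0; s >>= 1) {
        if (threadIdx.x < s) buf[threadIdx.x] += buf[threadIdx.x + s];
        __syncthreads();
    }
    if (threadIdx.x == 0) *out = buf[0];
}

__global__ void sum_large(const float* in, float* out, int n) {
    // Many-block variant: grid-stride loop plus an atomic accumulation;
    // coarse, but adequate to show the contrast with the small-input kernel.
    float acc = 0.0f;
    for (int i = blockIdx.x * blockDim.x + threadIdx.x; i < n;
         i += gridDim.x * blockDim.x)
        acc += in[i];
    atomicAdd(out, acc);
}

// Hypothetical dispatcher: the 4096-element threshold stands in for a
// cut-over point that an input-aware compiler would derive per GPU target.
void sum_dispatch(const float* d_in, float* d_out, int n) {
    cudaMemset(d_out, 0, sizeof(float));
    if (n <= 4096)
        sum_small<<<1, 256>>>(d_in, d_out, n);
    else
        sum_large<<<(n + 255) / 256, 256>>>(d_in, d_out, n);
}

In this sketch the choice is a single size threshold; the abstract's broader point is that the profitable variant (thread composition, memory-hierarchy usage) shifts with input characteristics, so the selection is deferred until the actual input is known.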