Exploiting GPU Hardware Saturation for Fast Compiler Optimization

  • Authors:
  • Alberto Magni;Christophe Dubach;Michael O'Boyle

  • Affiliations:
  • School of Informatics, University of Edinburgh, United Kingdom;School of Informatics, University of Edinburgh, United Kingdom;School of Informatics, University of Edinburgh, United Kingdom

  • Venue:
  • Proceedings of Workshop on General Purpose Processing Using GPUs
  • Year:
  • 2014

Quantified Score

Hi-index 0.00

Visualization

Abstract

Graphics Processing Units (GPUs) are efficient devices capable of delivering high performance for general purpose computation. Realizing their full performance potential often requires extensive compiler tuning. This process is particularly expensive since it has to be repeated for each target program and platform. In this paper we study the utilization of GPU hardware resources across multiple input sizes and compiler options. In this context we introduce the notion of hardware saturation. Saturation is reached when an application is executed with a number of threads large enough to fully utilize the available hardware resources. We give experimental evidence of hardware saturation and describe its properties using 16 OpenCL kernels on 3 GPUs from Nvidia and AMD. We show that input sizes that saturates the GPU show performance stability across compiler transformations. Using the thread-coarsening transformation as an example, we show that compiler settings maintain their relative performance across input sizes within the saturation region. Leveraging these hardware and software properties we propose a technique to identify the input size at the lower bound of the saturation zone, we call it Minimum Saturation Point (MSP). By performing iterative compilation on the MSP input size we obtain results effectively applicable for much large input problems reducing the overhead of tuning by an order of magnitude on average.