A stencil compiler for short-vector SIMD architectures

  • Authors:
  • Tom Henretty;Richard Veras;Franz Franchetti;Louis-Noël Pouchet;J. Ramanujam;P. Sadayappan

  • Affiliations:
  • The Ohio State University, Columbus, OH, USA;Carnegie Mellon University, Pittsburgh, PA, USA;Carnegie Mellon University, Pittsburgh, PA, USA;University of California, Los Angeles, Los Angeles, OH, USA;Louisiana State University, Baton Rouge, LA, USA;The Ohio State University, Columbus, OH, USA

  • Venue:
  • Proceedings of the 27th international ACM conference on International conference on supercomputing
  • Year:
  • 2013

Quantified Score

Hi-index 0.00

Visualization

Abstract

Stencil computations are an integral component of applications in a number of scientific computing domains. Short-vector SIMD instruction sets are ubiquitous on modern processors and can be used to significantly increase the performance of stencil computations. Traditional approaches to optimizing stencils on these platforms have focused on either short-vector SIMD or data locality optimizations. In this paper, we propose a domain specific language and compiler for stencil computations that allows specification of stencils in a concise manner and automates both locality and short-vector SIMD optimizations, along with effective utilization of multi-core parallelism. Loop transformations to enhance data locality and enable load-balanced parallelism are combined with a data layout transformation to effectively increase the performance of stencil computations. Performance increases are demonstrated for a number of stencils on several modern SIMD architectures.