When polyhedral transformations meet SIMD code generation

  • Authors:
  • Martin Kong; Richard Veras; Kevin Stock; Franz Franchetti; Louis-Noël Pouchet; P. Sadayappan

  • Affiliations:
  • Ohio State University, Columbus, OH, USA; Carnegie Mellon University, Pittsburgh, PA, USA; Ohio State University, Columbus, OH, USA; Carnegie Mellon University, Pittsburgh, PA, USA; University of California Los Angeles, Los Angeles, CA, USA; Ohio State University, Columbus, OH, USA

  • Venue:
  • Proceedings of the 34th ACM SIGPLAN Conference on Programming Language Design and Implementation
  • Year:
  • 2013

Abstract

Data locality and parallelism are critical optimization objectives for performance on modern multi-core machines. Both coarse-grain parallelism (e.g., multi-core) and fine-grain parallelism (e.g., vector SIMD) must be effectively exploited, but despite decades of progress at both ends, current compiler optimization schemes that attempt to address data locality and both kinds of parallelism often fail at one of the three objectives. We address this problem by proposing a 3-step framework, which aims for integrated data locality, multi-core parallelism and SIMD execution of programs. We define the concept of vectorizable codelets, with properties tailored to achieve effective SIMD code generation for the codelets. We leverage the power of a modern high-level transformation framework to restructure a program to expose good ISA-independent vectorizable codelets, exploiting multi-dimensional data reuse. Then, we generate ISA-specific customized code for the codelets, using a collection of lower-level SIMD-focused optimizations. We demonstrate our approach on a collection of numerical kernels that we automatically tile, parallelize and vectorize, exhibiting significant performance improvements over existing compilers.
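
To make the separation between the high-level restructuring and the low-level SIMD step more concrete, the following C fragment is a minimal sketch of a loop split into an outer tile loop and a fixed-size inner kernel with unit-stride accesses. It is an illustration only, not the paper's codelet definition or its generated code: the names `axpy_codelet`, `axpy_tiled`, and `CODELET_LEN` are hypothetical, the width of 8 is an arbitrary assumption, and this simple kernel does not exhibit the multi-dimensional data reuse the abstract mentions.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical codelet width (an assumption for this sketch):
   a fixed trip count known at compile time. */
enum { CODELET_LEN = 8 };

/* Inner kernel: fixed trip count, unit-stride accesses, and no
   loop-carried dependence, so a compiler or a SIMD code generator
   can map the loop body onto vector instructions. */
static void axpy_codelet(float a, const float *restrict x, float *restrict y)
{
    for (int i = 0; i < CODELET_LEN; ++i)
        y[i] += a * x[i];
}

/* Outer tile loop: each tile touches a disjoint slice of y, so the
   iterations are independent and could be distributed across cores.
   Assumes x and y do not overlap and n is a multiple of CODELET_LEN. */
static void axpy_tiled(size_t n, float a, const float *x, float *y)
{
    assert(n % CODELET_LEN == 0);
    for (size_t t = 0; t < n; t += CODELET_LEN)
        axpy_codelet(a, x + t, y + t);
}
```

Calling `axpy_tiled(16, 2.0f, x, y)` on two non-overlapping 16-element arrays updates `y` in two tiles of eight elements each. Keeping the inner kernel free of bounds that vary per tile is the design choice that lets the ISA-specific step be handled separately from tiling and parallelization.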