On the interaction of tiling and automatic parallelization

  • Authors:
  • Zhelong Pan;Brian Armstrong;Hansang Bae;Rudolf Eigenmann

  • Affiliations:
  • Purdue University, School of ECE, West Lafayette, IN;Purdue University, School of ECE, West Lafayette, IN;Purdue University, School of ECE, West Lafayette, IN;Purdue University, School of ECE, West Lafayette, IN

  • Venue:
  • IWOMP'05/IWOMP'06 Proceedings of the 2005 and 2006 international conference on OpenMP shared memory parallel programming
  • Year:
  • 2005

Abstract

Iteration space tiling is a well-explored programming and compiler technique for enhancing program locality. Its performance benefit appears obvious, as the ratio of processor speed to memory speed increases continuously. In an effort to include a tiling pass in an advanced parallelizing compiler, we have found that the interaction of tiling and parallelization raises unexplored issues. Applying existing sequential tiling techniques, followed by parallelization, leads to performance degradation in many programs. Applying tiling after parallelization without considering parallel execution semantics may lead to incorrect programs. Doing so conservatively also introduces overhead in some of the measured programs. In this paper, we present an algorithm that applies tiling in concert with parallelization. The algorithm avoids the above negative effects. Our paper also presents the first comprehensive evaluation of tiling techniques on compiler-parallelized programs. Our tiling algorithm improves the SPEC CPU95 floating-point programs by up to 21% over nontiled versions (4.9% on average) and the SPEC CPU2000 Fortran 77 programs by up to 49% (11% on average). Notably, in about half of the benchmarks, tiling does not have a significant effect.
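To make the terms concrete, the following is a minimal sketch of iteration space tiling combined with an OpenMP parallel loop, written in C. It is not the algorithm presented in the paper; the matrix transpose kernel, the function names `transpose_plain` and `transpose_tiled`, and the `tile` parameter are all assumptions chosen for illustration. The sketch places the parallel directive on the outer tile-controlling loop, so each thread works on whole tiles and writes a disjoint set of columns of `b`.

```c
#include <assert.h>

/* Untiled reference: b = transpose(a), both n x n, stored row-major. */
void transpose_plain(int n, const double *a, double *b)
{
    for (int i = 0; i < n; i++)
        for (int j = 0; j < n; j++)
            b[j * n + i] = a[i * n + j];
}

/* Tiled version (illustrative only): the i and j loops are split into
 * tile-controlling loops (ii, jj) and intra-tile loops (i, j).
 * The outermost tile loop is marked parallel; iterations with different
 * ii touch different columns of b, so no two threads write the same
 * element. Without OpenMP support the pragma is ignored and the code
 * runs sequentially. */
void transpose_tiled(int n, int tile, const double *a, double *b)
{
    assert(tile > 0);

    #pragma omp parallel for
    for (int ii = 0; ii < n; ii += tile) {
        for (int jj = 0; jj < n; jj += tile) {
            /* Clamp tile bounds so n need not be a multiple of tile. */
            int i_end = (ii + tile < n) ? ii + tile : n;
            int j_end = (jj + tile < n) ? jj + tile : n;
            for (int i = ii; i < i_end; i++)
                for (int j = jj; j < j_end; j++)
                    b[j * n + i] = a[i * n + j];
        }
    }
}
```

Both functions are expected to produce the same result for any positive `tile`; only the order in which elements are visited differs. Which loop carries the parallel directive, and how the tile size relates to the per-thread working set, are exactly the kinds of choices whose interaction the abstract describes.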