Synthesizing Transformations for Locality Enhancement of Imperfectly-Nested Loop Nests

  • Authors:
  • Nawaaz Ahmed; Nikolay Mateev; Keshav Pingali

  • Affiliations:
  • Department of Computer Science, Cornell University, Ithaca, New York 14853; Department of Computer Science, Cornell University, Ithaca, New York 14853; Department of Computer Science, Cornell University, Ithaca, New York 14853. pingali@cs.cornell.edu

  • Venue:
  • International Journal of Parallel Programming
  • Year:
  • 2001

Abstract

Linear loop transformations and tiling are known to be very effective for enhancing locality of reference in perfectly-nested loops. However, they cannot be applied directly to imperfectly-nested loops. Some compilers attempt to convert imperfectly-nested loops into perfectly-nested loops by using statement sinking, loop fusion, etc., and then apply locality-enhancing transformations to the resulting perfectly-nested loops, but the approaches used are fairly ad hoc and may fail even for simple programs. In this paper, we present a systematic approach for synthesizing transformations to enhance locality in imperfectly-nested loops. The key idea is to embed the iteration space of each statement into a special iteration space called the product space. The product space can be viewed as a perfectly-nested loop nest, so embedding generalizes techniques like statement sinking and loop fusion, which are used in ad hoc ways in current compilers to produce perfectly-nested loops from imperfectly-nested ones. In contrast to these ad hoc techniques, however, our embeddings are chosen carefully to enhance locality. The product space can itself be transformed to increase locality further, after which fully permutable loops can be tiled. The final code generation step may produce imperfectly-nested loops as output if that is desirable. We present experimental evidence for the effectiveness of this approach, using dense numerical linear algebra benchmarks, relaxation codes, and the tomcatv code from the SPEC benchmarks.
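
To make the idea of embedding concrete, the following is a minimal sketch of statement sinking, one of the ad hoc techniques the abstract says embedding generalizes. It is an illustration only and is not taken from the paper: the function names, the row-sum computation, and the choice to place the initialization statement at the point (i, 0) are all assumptions made for this example. The first function is an imperfectly-nested loop; the second is a perfectly-nested loop computing the same result, where the statement that used to sit outside the inner loop is guarded so that it runs at exactly one point of the two-dimensional iteration space.

```python
def row_sums_imperfect(a):
    """Imperfectly-nested version: S1 sits outside the inner j loop."""
    n = len(a)
    s = [None] * n
    for i in range(n):
        s[i] = 0                    # S1: executes once per i
        for j in range(len(a[i])):
            s[i] += a[i][j]         # S2: executes once per (i, j)
    return s


def row_sums_embedded(a):
    """Perfectly-nested version: S1 is sunk into the j loop.

    Assumption for this sketch: every row has at least one element,
    so the point (i, 0) exists for every i and S1 is never skipped.
    """
    n = len(a)
    s = [None] * n
    for i in range(n):
        for j in range(len(a[i])):
            if j == 0:              # S1 mapped to the point (i, 0)
                s[i] = 0
            s[i] += a[i][j]         # S2 stays at its own point (i, j)
    return s


if __name__ == "__main__":
    matrix = [[1, 2], [3, 4]]
    print(row_sums_imperfect(matrix))
    print(row_sums_embedded(matrix))
```

Once every statement lives in the same perfectly-nested space, transformations such as loop permutation and tiling can in principle be applied to the nest as a whole. The paper's contribution, per the abstract, is choosing such placements systematically to enhance locality rather than fixing them by hand as done here.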