A Loop Transformation Algorithm for Communication Overlapping

  • Authors: Kazuaki Ishizaki; Hideaki Komatsu; Toshio Nakatani
  • Affiliations: 1623-14, Shimotsuruma, Yamato-shi, Kanagawa-ken 242-8502, Japan (all three authors)
  • Venue: International Journal of Parallel Programming - Special issue on international symposium on high performance computing 1997, part I
  • Year: 2000

Abstract

Overlapping communication with computation is a well-known approach to improving performance. Previous research has focused on optimizations performed by the programmer. This paper presents a compiler algorithm that automatically determines the appropriate loop indices of a given nested loop and applies loop interchange and tiling in order to overlap communication with computation. The algorithm avoids generating redundant communication by providing a framework for combining information on data dependence, communication, and reuse. The paper also describes a method of generating messages to exchange data between processors for tiled loops on distributed-memory machines. The algorithm has been implemented in our High Performance Fortran (HPF) compiler, and experimental results have shown its effectiveness on distributed-memory machines, such as the RISC System/6000 Scalable POWERparallel System. The paper also discusses the architectural problems of efficient optimization.
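
The general idea of overlapping communication with computation in a tiled loop can be illustrated with a minimal sketch. This is not the paper's algorithm or its HPF compiler output: it is a single-process Python simulation in which a background thread stands in for message passing, and the names `fetch_tile` and `tiled_sum_with_overlap` are hypothetical. It only shows the schedule that tiling makes possible, namely starting the transfer for tile k+1 before computing on tile k.

```python
from concurrent.futures import ThreadPoolExecutor


def fetch_tile(remote, start, stop):
    """Hypothetical stand-in for receiving one tile of remote data."""
    return remote[start:stop]


def tiled_sum_with_overlap(remote, tile_size):
    """Sum the squares of `remote`, prefetching tile k+1 while computing tile k."""
    n = len(remote)
    # Tile the iteration space into [start, stop) blocks of at most tile_size.
    bounds = [(s, min(s + tile_size, n)) for s in range(0, n, tile_size)]
    total = 0
    if not bounds:
        return total
    with ThreadPoolExecutor(max_workers=1) as pool:
        # Issue the "communication" for the first tile up front.
        pending = pool.submit(fetch_tile, remote, *bounds[0])
        for k in range(len(bounds)):
            tile = pending.result()  # wait for the current tile's data
            if k + 1 < len(bounds):
                # Start the transfer for the next tile before computing.
                pending = pool.submit(fetch_tile, remote, *bounds[k + 1])
            # Computation on the current tile proceeds while the
            # next tile's transfer is in flight.
            total += sum(x * x for x in tile)
    return total
```

On a real distributed-memory machine the background thread would be replaced by nonblocking sends and receives, and the choice of which loop to tile would depend on the dependence, communication, and reuse information the abstract mentions.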