Automatic data and computation decomposition for distributed memory machines

  • Authors:
  • Qi Ning; V. Van Dongen; G. R. Gao

  • Venue:
  • HICSS '95 Proceedings of the 28th Hawaii International Conference on System Sciences
  • Year:
  • 1995

Abstract

We have developed an automatic compile-time technique for computation and data decomposition on distributed memory machines. The method handles complex programs containing perfect and nonperfect loop nests, with or without loop-carried dependences. Our decomposition algorithms divide a program into collections of loop nests, called clusters, such that data redistribution is allowed only between clusters. Within each cluster, the decomposition and data locality constraints are formulated as a system of homogeneous linear equations, which is solved by polynomial-time algorithms. The algorithm can selectively relax data locality constraints within a cluster to balance parallelism against data locality. These relaxations are guided by the hierarchical nesting structure of the program, proceeding from outer to inner nesting levels, so that communication is kept at the outermost level possible. This work is central to the ongoing compiler development effort under the EPPP (Environment for Portable Parallel Programming) project. A brief discussion of the current implementation is included.
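
To make the idea of locality constraints as a homogeneous linear system concrete, here is a minimal sketch in Python. It is an illustration under stated assumptions and not the paper's actual formulation: the toy loop nest `A[i][j] = B[j][i]`, the one-dimensional virtual processor mapping, the unknown names `c1, c2, a1, a2, b1, b2`, and the helper `null_space` are all hypothetical. Requiring every iteration to find its operands locally gives equations of the form M x = 0, and any nonzero vector in the null space is a communication-free decomposition for this toy case.

```python
from fractions import Fraction

def null_space(rows):
    """Return a basis of {x : M x = 0} via exact Gauss-Jordan elimination."""
    m = [[Fraction(v) for v in row] for row in rows]
    n_cols = len(m[0])
    pivots = []  # pivots[k] is the pivot column of reduced row k
    r = 0
    for c in range(n_cols):
        # Look for a row at or below r with a nonzero entry in column c.
        p = next((i for i in range(r, len(m)) if m[i][c] != 0), None)
        if p is None:
            continue  # no pivot here, so column c is a free variable
        m[r], m[p] = m[p], m[r]
        pv = m[r][c]
        m[r] = [v / pv for v in m[r]]
        # Clear column c in every other row.
        for i in range(len(m)):
            if i != r and m[i][c] != 0:
                f = m[i][c]
                m[i] = [a - f * b for a, b in zip(m[i], m[r])]
        pivots.append(c)
        r += 1
        if r == len(m):
            break
    free = [c for c in range(n_cols) if c not in pivots]
    basis = []
    for f in free:
        # Set one free variable to 1 and solve for the pivot variables.
        vec = [Fraction(0)] * n_cols
        vec[f] = Fraction(1)
        for k, pc in enumerate(pivots):
            vec[pc] = -m[k][f]
        basis.append(vec)
    return basis

# Hypothetical toy loop nest (an assumption, not taken from the paper):
#   for i: for j: A[i][j] = B[j][i]
# Iteration (i, j) runs on virtual processor c1*i + c2*j.
# Element A[x][y] lives on a1*x + a2*y, element B[x][y] on b1*x + b2*y.
# Unknown vector x = [c1, c2, a1, a2, b1, b2].
M = [
    [1, 0, -1, 0, 0, 0],   # c1 - a1 = 0  (A[i][j], coefficient of i)
    [0, 1, 0, -1, 0, 0],   # c2 - a2 = 0  (A[i][j], coefficient of j)
    [1, 0, 0, 0, 0, -1],   # c1 - b2 = 0  (B[j][i], coefficient of i)
    [0, 1, 0, 0, -1, 0],   # c2 - b1 = 0  (B[j][i], coefficient of j)
]

basis = null_space(M)
for vec in basis:
    print([int(v) for v in vec])
```

In this toy case the null space is two-dimensional. One basis vector distributes iterations and `A` by column index and `B` by row index; the other does the reverse. When such a system has only the zero solution, some locality constraint would have to be dropped, which is the situation the relaxation step in the abstract addresses.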