We have developed an automatic compile-time technique for computation and data decomposition on distributed-memory machines. The method handles complex programs containing perfect and non-perfect loop nests, with or without loop-carried dependences. The decomposition algorithms divide a program into collections of loop nests, called clusters, such that data redistribution is allowed only between clusters. Within each cluster, decomposition and data locality constraints are formulated as a system of homogeneous linear equations, which is solved by polynomial-time algorithms. The algorithm can selectively relax data locality constraints within a cluster to balance parallelism against data locality. These relaxations follow the hierarchical nesting structure of the program from outer to inner levels, so that communication is kept at the outermost possible level. This work is central to the ongoing compiler development effort under the EPPP (Environment for Portable Parallel Programming) project. A brief discussion of the current implementation is included.
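As a rough illustration of how locality constraints can become a homogeneous linear system, the sketch below considers a single hypothetical loop nest, `A[i][j] = B[j][i]`, and looks for linear data decompositions that place each pair of communicating elements on the same processor. The formulation is a simplified assumption made for illustration, not the paper's actual algorithm: each array gets an unknown row vector (`d_A`, `d_B`) that maps an array index to a processor coordinate, and the names `locality_rows` and `null_space` are invented here. The null space is computed by exact Gauss-Jordan elimination, which runs in polynomial time.

```python
from fractions import Fraction


def locality_rows(f_a, f_b):
    """Build rows of M such that M @ [d_A, d_B] = 0 encodes d_A*F_A == d_B*F_B.

    f_a and f_b are access matrices (rows: array dimensions, columns: loop
    indices). This encoding is an illustrative assumption.
    """
    n_loops = len(f_a[0])
    rows = []
    for k in range(n_loops):
        row = [f_a[r][k] for r in range(len(f_a))]
        row += [-f_b[r][k] for r in range(len(f_b))]
        rows.append(row)
    return rows


def null_space(rows):
    """Return a basis of the null space using exact Gauss-Jordan elimination."""
    m = [[Fraction(v) for v in r] for r in rows]
    n_cols = len(m[0])
    pivots = []
    r = 0
    for c in range(n_cols):
        if r == len(m):
            break
        # Find a row at or below r with a nonzero entry in column c.
        p = next((i for i in range(r, len(m)) if m[i][c] != 0), None)
        if p is None:
            continue
        m[r], m[p] = m[p], m[r]
        pivot = m[r][c]
        m[r] = [v / pivot for v in m[r]]
        # Eliminate column c from every other row.
        for i in range(len(m)):
            if i != r and m[i][c] != 0:
                factor = m[i][c]
                m[i] = [a - factor * b for a, b in zip(m[i], m[r])]
        pivots.append(c)
        r += 1
    basis = []
    for free in range(n_cols):
        if free in pivots:
            continue
        vec = [Fraction(0)] * n_cols
        vec[free] = Fraction(1)
        for row_idx, pivot_col in enumerate(pivots):
            vec[pivot_col] = -m[row_idx][free]
        basis.append(vec)
    return basis


# Hypothetical loop nest: for i, for j: A[i][j] = B[j][i]
F_A = [[1, 0], [0, 1]]  # A is indexed by (i, j)
F_B = [[0, 1], [1, 0]]  # B is indexed by (j, i)

M = locality_rows(F_A, F_B)
basis = null_space(M)
for vec in basis:
    d_a, d_b = vec[:2], vec[2:]
    print("d_A =", [int(v) for v in d_a], " d_B =", [int(v) for v in d_b])
```

Under these assumptions, each basis vector is a communication-free layout: one pairs a distribution of `A` along its second dimension with a distribution of `B` along its first, and the other does the reverse. A system whose null space contains only the zero vector has no such layout, which is the kind of situation where relaxing some locality constraints becomes necessary.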