Compile-time generation of regular communications patterns
Proceedings of the 1991 ACM/IEEE conference on Supercomputing
Vienna Fortran—a Fortran language extension for distributed memory multiprocessors
Languages, compilers and run-time environments for distributed memory machines
Generating communication for array statements: design, implementation, and evaluation
Journal of Parallel and Distributed Computing - Special issue on data parallel algorithms and programming
Generating local addresses and communication sets for data-parallel programs
Journal of Parallel and Distributed Computing
A linear-time algorithm for computing the memory access sequence in data-parallel programs
PPOPP '95 Proceedings of the fifth ACM SIGPLAN symposium on Principles and practice of parallel programming
Efficient address generation for block-cyclic distributions
ICS '95 Proceedings of the 9th international conference on Supercomputing
Compiling array expressions for efficient execution on distributed-memory machines
Journal of Parallel and Distributed Computing
Journal of Parallel and Distributed Computing - Special issue on compilation techniques for distributed memory systems
Communication generation for data-parallel languages
Compiling Global Name-Space Parallel Loops for Distributed Execution
IEEE Transactions on Parallel and Distributed Systems
LCPC '96 Proceedings of the 9th International Workshop on Languages and Compilers for Parallel Computing
Code Generation for Complex Subscripts in Data-Parallel Programs
LCPC '97 Proceedings of the 10th International Workshop on Languages and Compilers for Parallel Computing
Efficient Compilation of Array Statements for Private Memory Multicomputers
Generating local memory access sequences is an integral part of compiling a data-parallel program into SPMD code. Most previous research on local memory access sequences has focused on one-dimensional arrays distributed with a CYCLIC(k) distribution. The local memory access sequences for multidimensional arrays with independent subscripts are produced by repeatedly applying the method for one-dimensional arrays. However, the task becomes highly complex when subscripts are coupled, that is, when the subscripts in different dimensions depend on the same loop induction variables. This paper presents an efficient approach to computing the iterations executed on each processor by exploiting repetitive patterns in memory accesses. The approach uses smaller iteration tables than those of Ramanujam [Code generation for complex subscripts in data-parallel programs, in: Z. Li et al. (Eds.), Languages and Compilers for Parallel Computing, Lecture Notes in Computer Science, vol. 1366, Springer-Verlag, Berlin, 1998, pp. 49-63], and the iteration gap table is not required. The method has been implemented on an IBM SP2. Experimental results demonstrate the efficiency of the proposed method.
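To make the problem statement concrete, the following is a minimal brute-force sketch in Python of what has to be computed. It is not the table-based method of the paper: it scans every iteration, which is exactly the cost the paper's approach is meant to avoid. The function names, the zero-based indexing, and the affine subscript form a*i + b are assumptions made for illustration only.

```python
def owner(index, k, p):
    """Processor that owns a global index under CYCLIC(k) over p processors."""
    return (index // k) % p


def local_index(index, k, p):
    """Local memory offset of a global index on its owning processor."""
    return (index // (k * p)) * k + index % k


def local_access_sequence(lower, upper, stride, k, p, proc):
    """Local offsets touched by processor `proc` for the section lower:upper:stride
    of a one-dimensional array (assumed zero-based, inclusive upper bound)."""
    return [
        local_index(i, k, p)
        for i in range(lower, upper + 1, stride)
        if owner(i, k, p) == proc
    ]


def coupled_iterations(n_iters, subscripts, dists, proc):
    """Iterations i in [0, n_iters) executed on processor `proc` when every
    dimension's subscript depends on the same loop variable i.

    subscripts: one (a, b) pair per dimension, meaning subscript a*i + b
    dists:      one (k, p) pair per dimension, meaning CYCLIC(k) over p processors
    proc:       processor coordinate per dimension
    """
    result = []
    for i in range(n_iters):
        # The iteration runs on `proc` only if it owns the element in every dimension.
        if all(
            owner(a * i + b, k, p) == m
            for (a, b), (k, p), m in zip(subscripts, dists, proc)
        ):
            result.append(i)
    return result


if __name__ == "__main__":
    # One-dimensional case: section 0:20:3, CYCLIC(2) over 2 processors.
    print(local_access_sequence(0, 20, 3, k=2, p=2, proc=0))
    # Coupled case: A(i, 2*i + 1), both dimensions CYCLIC(2) over 2 processors.
    print(coupled_iterations(12, [(1, 0), (2, 1)], [(2, 2), (2, 2)], (0, 1)))
```

In the coupled case, the ownership tests in the two dimensions cannot be evaluated independently, because both depend on the same i. This is why applying a one-dimensional method dimension by dimension is not sufficient, and why the iterations on each processor follow a repeating pattern that a table-based scheme can exploit.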