Compiling the array assignment statement of High Performance Fortran when the data arrays have block-cyclic distributions is considered difficult, and several algorithms have been published to solve this problem. We present a comprehensive study of the performance of these algorithms. We classify the algorithms into several families, identify several issues of interest in the compilation process, and present experimental performance data for the various algorithms. We demonstrate that block-cyclic distributions can be compiled almost as efficiently as block and cyclic distributions.
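For readers unfamiliar with the problem, the following minimal sketch illustrates what a compiler must compute for a block-cyclic distribution. It is not one of the algorithms studied here; it is a brute-force baseline under stated assumptions: a one-dimensional array with zero-based indices, block size `b`, `p` processors, and a regular section `lo:hi:stride` with positive stride. The function names `owner`, `local_index`, and `local_addresses` are hypothetical and chosen only for illustration.

```python
def owner(i, b, p):
    """Processor owning global index i when blocks of size b are dealt
    round-robin to p processors (zero-based indices assumed)."""
    return (i // b) % p


def local_index(i, b, p):
    """Offset of global index i within its owner's local memory.

    Each full round of b * p global elements contributes one block of b
    elements to every processor."""
    return (i // (b * p)) * b + i % b


def local_addresses(lo, hi, stride, b, p, me):
    """Local addresses on processor `me` touched by the section lo:hi:stride.

    Brute force: tests every element of the section, so the cost is
    proportional to the section length rather than to the number of
    elements that processor `me` actually owns."""
    return [
        local_index(i, b, p)
        for i in range(lo, hi + 1, stride)
        if owner(i, b, p) == me
    ]


if __name__ == "__main__":
    # Section 0:23:3 over blocks of size 4 on 3 processors, seen from processor 0.
    print(local_addresses(0, 23, 3, b=4, p=3, me=0))
```

With `b = 1` the mapping reduces to a cyclic distribution, and with `b` equal to the array length divided by `p` (rounded up) it reduces to a block distribution. The difficulty in the general case is enumerating the local addresses without testing every element of the section, as the brute-force loop above does.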