Data-parallel languages, such as High Performance Fortran, are widely regarded as a promising means for writing portable programs for distributed-memory machines. Novel features of these languages call for the development of new techniques in both compilers and run-time systems. In this paper, we present an improved algorithm for finding the local memory access sequence in computations involving regular sections of arrays with cyclic(k) distributions. After establishing that regular section indices correspond to elements of an integer lattice, we show how to find a lattice basis that allows simple and fast enumeration of memory accesses. The complexity of our algorithm is shown to be lower than that of the previous solution for the same problem. In addition, experimental results demonstrate the efficiency of our method in practice.
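
To make the problem statement concrete, the following is a minimal brute-force sketch of what a "local memory access sequence" is for a regular section l:u:s of an array distributed cyclic(k) over p processors. It is not the lattice-based algorithm of the paper: it simply tests every section index, so its cost grows with the section length. The function names are hypothetical, and the sketch assumes zero-based global indices, a positive stride, and that each processor stores its blocks contiguously in local memory.

```python
def owner(i, p, k):
    """Processor owning global index i under cyclic(k) over p processors."""
    return (i // k) % p


def local_address(i, p, k):
    """Offset of global index i in its owner's local memory.

    Assumption: the owner stores its blocks of size k back to back,
    so the j-th block it owns starts at local offset j * k.
    """
    return (i // (p * k)) * k + i % k


def local_access_sequence(l, u, s, p, k, m):
    """Local addresses that processor m touches for the section l:u:s.

    Brute force: visits every section index (inclusive upper bound u,
    stride s > 0) and keeps those owned by processor m.
    """
    return [
        local_address(i, p, k)
        for i in range(l, u + 1, s)
        if owner(i, p, k) == m
    ]


# Example: 4 processors, block size 4, section 0:47:3, processor 0.
seq = local_access_sequence(0, 47, 3, p=4, k=4, m=0)
gaps = [b - a for a, b in zip(seq, seq[1:])]
print(seq)   # [0, 3, 6, 9]
print(gaps)  # [3, 3, 3]
```

The differences between consecutive local addresses (`gaps` above) are the quantity an efficient method aims to produce directly, without scanning indices owned by other processors.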