A linear-time algorithm for computing the memory access sequence in data-parallel programs

Authors:
Ken Kennedy;Nenad Nedeljkovic;Ajay Sethi
Affiliations:
Center for Research on Parallel Computation, Department of Computer Science, Rice University;Center for Research on Parallel Computation, Department of Computer Science, Rice University;Center for Research on Parallel Computation, Department of Computer Science, Rice University
Venue:
PPOPP '95 Proceedings of the fifth ACM SIGPLAN symposium on Principles and practice of parallel programming
Year:
1995

Abstract

Data-parallel languages, such as High Performance Fortran, are widely regarded as a promising means for writing portable programs for distributed-memory machines. Novel features of these languages call for the development of new techniques in both compilers and run-time systems. In this paper, we present an improved algorithm for finding the local memory access sequence in computations involving regular sections of arrays with cyclic(k) distributions. After establishing the fact that regular section indices correspond to elements of an integer lattice, we show how to find a lattice basis that allows for simple and fast enumeration of memory accesses. The complexity of our algorithm is shown to be lower than that of the previous solution for the same problem. In addition, the experimental results demonstrate the efficiency of our method in practice.
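
To make the problem statement concrete, the following sketch computes the local memory access sequence by brute force. It is not the lattice-based, linear-time algorithm of the paper; it only illustrates what that algorithm has to produce. The storage layout is an assumption (the usual one for cyclic(k): blocks of k elements dealt round-robin to p processors, each processor packing its blocks contiguously), and the names `owner_and_local`, `local_access_sequence`, and `gaps` are hypothetical helpers introduced here.

```python
def owner_and_local(i, p, k):
    """Map global index i to (owning processor, local address).

    Assumed layout: block b = i // k goes to processor b % p and is the
    (b // p)-th block stored in that processor's local memory.
    """
    block = i // k
    return block % p, (block // p) * k + i % k


def local_access_sequence(l, u, s, p, k, m):
    """Local addresses touched on processor m by the regular section l:u:s.

    Brute force: visits every section element, so the cost grows with the
    section length rather than with k. Assumes a positive stride s.
    """
    addrs = []
    for i in range(l, u + 1, s):
        proc, addr = owner_and_local(i, p, k)
        if proc == m:
            addrs.append(addr)
    return addrs


def gaps(addrs):
    """Distances between consecutive local addresses (the memory gaps)."""
    return [b - a for a, b in zip(addrs, addrs[1:])]


if __name__ == "__main__":
    # Section 0:20:3 of an array distributed cyclic(2) over 2 processors.
    for m in range(2):
        seq = local_access_sequence(0, 20, 3, p=2, k=2, m=m)
        print(m, seq, gaps(seq))
```

In this small case processor 0 touches local addresses 0, 5, 6 and processor 1 touches 1, 2, 7, 8. The gaps between consecutive addresses are what a node program steps through when traversing its local part of the section, which is why a method that produces them without scanning the whole section is of interest.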