Generating local addresses and communication sets for data-parallel programs

Authors:
Siddhartha Chatterjee;John R. Gilbert;Fred J. E. Long;Robert Schreiber;Shang-Hua Teng
Affiliations:
Research Institute for Advanced Computer Science (RIACS), NASA Ames Research Center, Moffett Field, CA;Xerox Palo Alto Research Center, Palo Alto, CA;Univ. of California, Santa Cruz;Research Institute for Advanced Computer Science (RIACS), NASA Ames Research Center, Moffett Field, CA;Massachusetts Institute of Technology, Cambridge
Venue:
PPOPP '93 Proceedings of the fourth ACM SIGPLAN symposium on Principles and practice of parallel programming
Year:
1993

Citing 4
Cited 14

Principles of runtime support for parallel processors

ICS '88 Proceedings of the 2nd international conference on Supercomputing
Compile-time generation of regular communications patterns

Proceedings of the 1991 ACM/IEEE conference on Supercomputing
Performance of various computers using standard linear equations software

ACM SIGARCH Computer Architecture News
Compiling Global Name-Space Parallel Loops for Distributed Execution

IEEE Transactions on Parallel and Distributed Systems

An approach to communication-efficient data redistribution

ICS '94 Proceedings of the 8th international conference on Supercomputing
Compilation techniques for block-cyclic distributions

ICS '94 Proceedings of the 8th international conference on Supercomputing
A linear-time algorithm for computing the memory access sequence in data-parallel programs

PPOPP '95 Proceedings of the fifth ACM SIGPLAN symposium on Principles and practice of parallel programming
Efficient address generation for block-cyclic distributions

ICS '95 Proceedings of the 9th international conference on Supercomputing
Handling block-cyclic distributed arrays in Vienna Fortran 90

PACT '95 Proceedings of the IFIP WG10.3 working conference on Parallel architectures and compilation techniques
An efficient uniform run-time scheme for mixed regular-irregular applications

ICS '98 Proceedings of the 12th international conference on Supercomputing
A task- and data-parallel programming language based on shared objects

ACM Transactions on Programming Languages and Systems (TOPLAS)
An Integrated Runtime and Compile-Time Approach for Parallelizing Structured and Block Structured Applications

IEEE Transactions on Parallel and Distributed Systems
Efficient Algorithms for Array Redistribution

IEEE Transactions on Parallel and Distributed Systems
Exploiting Ownership Sets in HPF

LCPC '00 Proceedings of the 13th International Workshop on Languages and Compilers for Parallel Computing-Revised Papers
An Expression-Rewriting Framework to Generate Communication Sets for HPF Programs with Block-Cyclic Distribution

IPPS '98 Proceedings of the 12th International Parallel Processing Symposium
An efficient algorithm for communication set generation of data parallel programs with block-cyclic distribution

Parallel Computing
Opus: A Coordination Language for Multidisciplinary Applications

Scientific Programming
Compiler optimization to improve data locality for processor multithreading

Scientific Programming

Abstract

Generating local addresses and communication sets is an important issue in distributed-memory implementations of data-parallel languages such as High Performance Fortran. We show that for an array A affinely aligned to a template that is distributed across p processors with a cyclic(k) distribution, and a computation involving the regular section A(l:h:s), the local memory access sequence for any processor is characterized by a finite state machine of at most k states. We present fast algorithms for computing the essential information about these state machines, and extend the framework to handle multidimensional arrays. We also show how to generate communication sets using the state machine approach. Performance results show that this solution requires very little runtime overhead and acceptable preprocessing time.
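
To make the state-machine claim concrete, the following is a minimal brute-force sketch in Python. It is not the paper's fast algorithm: it simply enumerates the regular section A(l:h:s) and records, for one processor, which elements it owns and where they sit in local memory. It assumes an identity alignment (array element i maps to template cell i), 0-based indices, a positive stride s, and the usual cyclic(k) layout in which blocks of k consecutive cells are dealt round-robin to p processors. The function names (owner, local_address, local_access_sequence, transition_table) are hypothetical and chosen only for this illustration.

```python
from math import gcd


def owner(i, p, k):
    """Processor that owns template cell i under cyclic(k) over p processors."""
    return (i // k) % p


def local_address(i, p, k):
    """Position of cell i in its owner's local memory (blocks packed contiguously)."""
    return (i // (p * k)) * k + i % k


def local_access_sequence(p, k, l, h, s, m):
    """(global index, local address) pairs of A(l:h:s) owned by processor m.

    The upper bound h is inclusive, as in Fortran section notation.
    """
    return [(i, local_address(i, p, k))
            for i in range(l, h + 1, s)
            if owner(i, p, k) == m]


def transition_table(p, k, l, s, m):
    """Brute-force table: block offset -> (next block offset, local memory gap).

    The ownership pattern of the section repeats after pk / gcd(s, pk) elements,
    so enumerating two such periods exposes every transition at least once.
    """
    period = (p * k) // gcd(s, p * k)
    seq = local_access_sequence(p, k, l, l + 2 * period * s, s, m)
    table = {}
    for (i, a), (j, b) in zip(seq, seq[1:]):
        state = i % k              # offset of the current element within its block
        step = (j % k, b - a)      # successor offset and address gap
        # The offset alone must determine the step, otherwise this is not an FSM.
        assert table.setdefault(state, step) == step
    return table


if __name__ == "__main__":
    # Example: p = 2 processors, cyclic(2), section A(0:20:3), processor 0.
    print(local_access_sequence(2, 2, 0, 20, 3, 0))
    print(transition_table(2, 2, 0, 3, 0))
```

In this sketch the state is the element's offset within its block, so the table can never have more than k entries, which matches the bound stated in the abstract. The enumeration costs time proportional to pk / gcd(s, pk) per processor, which is the cost the paper's faster algorithms are meant to avoid.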