An Implementation Framework for HPF Distributed Arrays on Message-Passing Parallel Computer Systems

Authors:
Kees van Reeuwijk; Will Denissen; Henk J. Sips; Edwin M. R. M. Paalvast
Affiliations:
Delft Univ. of Technology, Delft, The Netherlands; TNO-TPD, Delft, The Netherlands; Delft Univ. of Technology, Delft, The Netherlands; TNO-TPD, Delft, The Netherlands
Venue:
IEEE Transactions on Parallel and Distributed Systems
Year:
1996

Abstract

Data parallel languages, like High Performance Fortran (HPF), support the notion of distributed arrays. However, the implementation of such distributed array structures and their access on message passing computers is not straightforward. This holds especially for distributed arrays that are aligned to each other and given a block-cyclic distribution.

In this paper, an implementation framework is presented for HPF distributed arrays on message passing computers. Methods are presented for efficient (in space and time) local index enumeration, local storage, and communication.

Techniques for local set enumeration provide the basis for constructing local iteration sets and communication sets. It is shown that both local set enumeration and local storage schemes can be derived from the same equation. Local set enumeration and local storage schemes are shown to be orthogonal, i.e., they can be freely combined. Moreover, for linear access sequences generated by our enumeration methods, the local address calculations can be moved out of the enumeration loop, yielding efficient local memory address generation.

The local set enumeration methods are implemented by using a relatively simple general transformation rule for absorbing ownership tests. This transformation rule can be repeatedly applied to absorb multiple ownership tests. Performance figures are presented for local iteration overhead, a simple communication pattern, and storage efficiency.
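As a rough illustration of what "absorbing an ownership test" can mean, the sketch below contrasts two ways of enumerating the global indices that one processor owns under a plain block-cyclic distribution. It is not the paper's transformation rule: it assumes a one-dimensional array with identity alignment and unit stride, uses the standard cyclic(m) mapping, and all function and parameter names (`owner`, `local_address`, `enumerate_guarded`, `enumerate_absorbed`, `n`, `m`, `nprocs`, `p`) are hypothetical.

```python
def owner(i, m, nprocs):
    """Processor owning global index i under a cyclic(m) distribution."""
    return (i // m) % nprocs


def local_address(i, m, nprocs):
    """Local storage address of global index i on its owning processor.

    Assumes a packed row-major layout: one row of m elements per
    block-cycle (a hypothetical but common storage choice).
    """
    row = i // (m * nprocs)   # which block-cycle the element falls in
    col = i % m               # offset inside the block
    return row * m + col


def enumerate_guarded(n, m, nprocs, p):
    """Naive form: visit every global index and test ownership."""
    return [i for i in range(n) if owner(i, m, nprocs) == p]


def enumerate_absorbed(n, m, nprocs, p):
    """Ownership test folded into the loop bounds.

    Only the blocks belonging to processor p are visited, so no
    per-element ownership test remains.
    """
    result = []
    row = 0
    while True:
        start = (row * nprocs + p) * m    # first global index of p's block
        if start >= n:
            break
        # Clip the last block at the array bound n.
        result.extend(range(start, min(start + m, n)))
        row += 1
    return result


if __name__ == "__main__":
    n, m, nprocs = 20, 3, 2
    for p in range(nprocs):
        idx = enumerate_absorbed(n, m, nprocs, p)
        addrs = [local_address(i, m, nprocs) for i in idx]
        print(p, idx, addrs)
```

In this simplified setting the local addresses visited by the absorbed loop are consecutive, which hints at why address calculations can be hoisted out of the enumeration loop; the aligned and strided cases treated in the paper need more machinery than is shown here.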