Appropriate data distribution has been found to be critical for obtaining good performance on distributed memory multicomputers such as the CM-5, Intel Paragon, and IBM SP-1. It has also been found that some programs need to change their distributions during execution (redistribution) for better performance. This work focuses on automatically generating efficient routines for redistribution. We present a new mathematical representation for regular distributions called PITFALLS and then discuss algorithms for redistribution based on this representation. A significant contribution of this work is the ability to handle arbitrary source and target processor sets while performing redistribution; another is the ability to handle arbitrary dimensionality for the array being redistributed in a scalable manner. The results presented show low overheads for our redistribution algorithm as compared to naive runtime methods.
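To make the baseline concrete, the following sketch shows the naive runtime method the abstract compares against: enumerating every array element to build the communication sets for redistributing a one-dimensional array between two block-cyclic distributions over possibly different processor sets. This is an illustrative assumption of the naive approach, not the PITFALLS algorithm itself; the function names and the O(n) enumeration are ours.

```python
def owner(i, b, p):
    """Owning processor of global element i under a cyclic(b)
    distribution over p processors: block index i // b, mod p."""
    return (i // b) % p

def redistribution_sets(n, b_src, p_src, b_dst, p_dst):
    """Naive communication-set computation for redistributing an
    n-element array from cyclic(b_src) over p_src processors to
    cyclic(b_dst) over p_dst processors (possibly a different set).

    Returns a dict mapping (source_proc, target_proc) pairs to the
    list of global indices that pair must exchange. Scanning every
    element costs O(n) per redistribution, which is the overhead
    that representation-based methods like PITFALLS avoid.
    """
    sets = {}
    for i in range(n):
        key = (owner(i, b_src, p_src), owner(i, b_dst, p_dst))
        sets.setdefault(key, []).append(i)
    return sets
```

For example, redistributing an 8-element array from cyclic(1) over 2 processors to cyclic(2) over 2 processors groups indices {0, 4} on the (0, 0) pair and {2, 6} on the (0, 1) pair, so half of processor 0's data must move to processor 1.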