Efficient Algorithms for Array Redistribution

Authors:
Rajeev Thakur;Alok Choudhary;J. Ramanujam
Venue:
IEEE Transactions on Parallel and Distributed Systems
Year:
1996

Abstract

Programs on distributed-memory parallel computers frequently need to redistribute arrays dynamically. This paper presents efficient algorithms for redistribution between different cyclic(k) distributions, as defined in High Performance Fortran. We first propose specially optimized algorithms for cyclic(x) to cyclic(y) redistribution when x is a multiple of y or y is a multiple of x. We then propose two algorithms, called the GCD method and the LCM method, for the general cyclic(x) to cyclic(y) redistribution when there is no particular relation between x and y. We have implemented these algorithms on the Intel Touchstone Delta and find that they perform well for different array sizes and numbers of processors.
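
To make the cyclic(x) to cyclic(y) problem concrete, the following is a minimal brute-force sketch in Python. It is not the paper's GCD or LCM method, and not its special-case algorithms either. It assumes a one-dimensional array with 0-based global indices, distributed over the same p processors before and after redistribution. The function names (owner, local_index, send_sets, pattern_period) are hypothetical and chosen only for illustration.

```python
from math import gcd

def owner(i, k, p):
    """Processor that owns global index i under cyclic(k) on p processors."""
    return (i // k) % p

def local_index(i, k, p):
    """Position of global index i inside its owner's local array under cyclic(k)."""
    return (i // (k * p)) * k + i % k

def send_sets(n, x, y, p):
    """Brute force: map (source, destination) to the global indices that move
    between them when an n-element array goes from cyclic(x) to cyclic(y)."""
    sets = {}
    for i in range(n):
        key = (owner(i, x, p), owner(i, y, p))
        sets.setdefault(key, []).append(i)
    return sets

def pattern_period(x, y, p):
    """Number of global indices after which the (source, destination) pattern
    repeats: ownership repeats every x*p and y*p elements respectively, so the
    joint pattern repeats every lcm(x*p, y*p) = p * lcm(x, y) elements."""
    return p * (x * y // gcd(x, y))

if __name__ == "__main__":
    # Illustrative parameters only: 8 elements, cyclic(4) to cyclic(2), 2 processors.
    for (src, dst), indices in sorted(send_sets(8, 4, 2, 2).items()):
        print(f"P{src} -> P{dst}: {indices}")
```

Enumerating every element as above costs time proportional to the array size on each run. The periodicity computed by pattern_period suggests why arithmetic on x, y, and p alone can describe the whole communication pattern, which is the kind of structure a dedicated redistribution algorithm can exploit.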