Processor Mapping Techniques Toward Efficient Data Redistribution

Authors:
Edgar T. Kalns; Lionel M. Ni
Affiliations:
-;-
Venue:
IEEE Transactions on Parallel and Distributed Systems
Year:
1995

Abstract

Run-time data redistribution can enhance algorithm performance on distributed-memory machines. Data can be redistributed explicitly between algorithm phases when a different data decomposition is expected to deliver better performance for the subsequent phase of computation. Redistribution, however, adds program overhead, because computation is suspended while data are exchanged among processor memories. In this paper, we present a technique that minimizes the amount of data exchanged for BLOCK to CYCLIC(c) (or vice versa) redistributions of arrays with an arbitrary number of dimensions. While preserving the semantics of the target (destination) distribution pattern, the technique manipulates the data-to-logical-processor mapping of the target pattern. When implemented on an IBM SP, the mapping technique demonstrates redistribution performance improvements of approximately 40% over the traditional data-to-processor mapping. Relative to the traditional mapping technique, the proposed method affords greater flexibility in specifying precisely which data elements are redistributed and which elements remain on-processor.
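The idea of remapping logical target processors can be illustrated with a small one-dimensional sketch. It is not the algorithm from the paper: it simply counts how many elements stay on-processor in a BLOCK to CYCLIC(c) redistribution, first under the identity mapping and then under the best logical-to-physical permutation found by exhaustive search, which is only practical for a handful of processors. All function names are hypothetical, and the array length is assumed to be a multiple of the processor count.

```python
from itertools import permutations


def block_owner(i, n, p):
    """Processor holding element i under BLOCK (contiguous chunks of ceil(n/p))."""
    chunk = -(-n // p)  # ceiling division
    return i // chunk


def cyclic_owner(i, c, p):
    """Logical processor holding element i under CYCLIC(c) (blocks of c dealt round-robin)."""
    return (i // c) % p


def overlap_matrix(n, p, c):
    """m[s][t] = elements on physical processor s (BLOCK) that logical processor t needs (CYCLIC(c))."""
    m = [[0] * p for _ in range(p)]
    for i in range(n):
        m[block_owner(i, n, p)][cyclic_owner(i, c, p)] += 1
    return m


def elements_kept(m, perm):
    """Elements that need no transfer when logical processor t is hosted on physical perm[t]."""
    return sum(m[perm[t]][t] for t in range(len(perm)))


def best_mapping(n, p, c):
    """Brute-force search over all p! mappings (illustration only, assumed small p)."""
    m = overlap_matrix(n, p, c)
    best = max(permutations(range(p)), key=lambda perm: elements_kept(m, perm))
    return best, elements_kept(m, best)


if __name__ == "__main__":
    n, p, c = 16, 4, 2
    m = overlap_matrix(n, p, c)
    identity = tuple(range(p))
    print("kept with identity mapping:", elements_kept(m, identity))  # 4
    perm, kept = best_mapping(n, p, c)
    print("kept with best mapping:", kept, "using", perm)  # 8 elements kept
```

With 16 elements, 4 processors, and c = 2, the identity mapping leaves 4 elements in place while the best permutation leaves 8, so the number of elements exchanged drops from 12 to 8. The target pattern is still a valid CYCLIC(2) distribution over logical processors; only the placement of logical processors on physical ones changes.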