A Generalized Processor Mapping Technique for Array Redistribution

Authors:
Ching-Hsien Hsu;Yeh-Ching Chung;Don-Lin Yang;Chyi-Ren Dow
Affiliations:
Chia Univ., Taichung, Taiwan;Chia Univ., Taichung, Taiwan;Chia Univ., Taichung, Taiwan;Chia Univ., Taichung, Taiwan
Venue:
IEEE Transactions on Parallel and Distributed Systems
Year:
2001

Abstract

In many scientific applications on distributed memory multicomputers, array redistribution is often required to enhance data locality and reduce remote memory access. Because redistribution is performed at runtime, there is a performance trade-off between the efficiency of the new data decomposition for a subsequent phase of an algorithm and the cost of redistributing data among processors. In this paper, we present a generalized processor mapping technique that minimizes the amount of data exchanged in BLOCK-CYCLIC(kr) to BLOCK-CYCLIC(r) array redistribution and vice versa. The main idea is first to develop mapping functions that compute a new rank for each destination processor. From these mapping functions, a new logical sequence of destination processors is derived, and this sequence is then used to minimize the amount of data exchanged in a redistribution. The technique handles array redistribution with arbitrary source and destination processor sets and can be applied to multidimensional array redistribution. We present a theoretical model to analyze the performance improvement it provides. To evaluate the technique, we implemented it on an IBM SP2 parallel machine. The experimental results show that it improves performance over a wide range of redistribution problems.
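
To make the idea of renumbering destination processors concrete, the following Python sketch counts how many elements must cross processors in a one-dimensional BLOCK-CYCLIC(kr) to BLOCK-CYCLIC(r) redistribution, with and without a reordering of the destination ranks. It is only an illustration under stated assumptions: the source and destination processor sets are taken to be the same, the helper names (`owner`, `overlap_matrix`, `moved`, `best_mapping`) are hypothetical, and the reordering is found by brute-force search over permutations rather than by the closed-form mapping functions the paper develops.

```python
from itertools import permutations

def owner(i, block, nprocs):
    """Rank that owns global index i under BLOCK-CYCLIC(block) on nprocs processors."""
    return (i // block) % nprocs

def overlap_matrix(n, src_block, dst_block, nprocs):
    """m[s][d] = number of elements held by source rank s that logical destination rank d needs."""
    m = [[0] * nprocs for _ in range(nprocs)]
    for i in range(n):
        m[owner(i, src_block, nprocs)][owner(i, dst_block, nprocs)] += 1
    return m

def moved(m, mapping):
    """Elements that change processor when logical destination rank d is hosted
    on physical processor mapping[d]; elements already on that processor stay put."""
    total = sum(map(sum, m))
    local = sum(m[mapping[d]][d] for d in range(len(m)))
    return total - local

def best_mapping(m):
    """Exhaustive search (illustration only, not the paper's mapping functions):
    try every permutation of destination ranks and keep the one that moves least."""
    return min(permutations(range(len(m))), key=lambda p: moved(m, p))

# Assumed example: 16 elements, 4 processors, r = 2, k = 2,
# so the redistribution is BLOCK-CYCLIC(4) -> BLOCK-CYCLIC(2).
m = overlap_matrix(n=16, src_block=4, dst_block=2, nprocs=4)
identity = tuple(range(4))
best = best_mapping(m)
print(moved(m, identity), moved(m, best))  # prints "12 8"
```

In this small case the default rank order moves 12 of the 16 elements, while a reordered sequence of destination processors moves only 8. Exhaustive search is practical only for a handful of processors, which is one reason closed-form mapping functions of the kind the abstract describes are attractive.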