Runtime data redistribution is often required in parallel algorithms to improve data locality, achieve dynamic load balancing, and reduce remote data access on distributed memory multicomputers. In this paper, we present a comprehensive set of techniques for implementing GEN_BLOCK redistribution in parallelizing compilers: indexing schemes for communication set generation, a contention-free communication scheduling algorithm, and an optimization technique for improving communication efficiency. Both theoretical analysis and experimental results show that the proposed techniques perform GEN_BLOCK data redistribution efficiently at runtime.
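To make the terms concrete, the following is a minimal sketch and not the paper's method. It assumes a one-dimensional array in which GEN_BLOCK gives each processor one contiguous block of arbitrary size, so each communication set is the overlap between a source block and a destination block. The function names `gen_block_messages` and `greedy_schedule` are hypothetical, and the scheduler is a naive first-fit heuristic standing in for the contention-free scheduling algorithm, which the abstract does not describe. Local copies (same source and destination processor) are kept in the message list for simplicity.

```python
from itertools import accumulate


def gen_block_messages(src_sizes, dst_sizes):
    """Return (src_proc, dst_proc, start_index, length) for every overlap
    between a source block and a destination block.

    Assumption: block i of each list belongs to processor i and blocks
    are laid out contiguously in processor order.
    """
    if sum(src_sizes) != sum(dst_sizes):
        raise ValueError("both distributions must cover the same array")

    # Prefix sums give the global [start, end) bounds of every block.
    src_bounds = [0, *accumulate(src_sizes)]
    dst_bounds = [0, *accumulate(dst_sizes)]

    messages = []
    i = j = 0
    while i < len(src_sizes) and j < len(dst_sizes):
        lo = max(src_bounds[i], dst_bounds[j])
        hi = min(src_bounds[i + 1], dst_bounds[j + 1])
        if hi > lo:
            messages.append((i, j, lo, hi - lo))
        # Advance whichever block ends first.
        if src_bounds[i + 1] <= dst_bounds[j + 1]:
            i += 1
        else:
            j += 1
    return messages


def greedy_schedule(messages):
    """Naive first-fit placement (illustrative only): within one step,
    each processor sends at most one message and receives at most one."""
    steps = []  # each entry: (senders, receivers, batch)
    for msg in messages:
        src, dst = msg[0], msg[1]
        for senders, receivers, batch in steps:
            if src not in senders and dst not in receivers:
                senders.add(src)
                receivers.add(dst)
                batch.append(msg)
                break
        else:
            steps.append(({src}, {dst}, [msg]))
    return [batch for _, _, batch in steps]


# Hypothetical example: 30 elements on 3 processors.
messages = gen_block_messages([12, 8, 10], [5, 15, 10])
schedule = greedy_schedule(messages)
```

In this example, processor 0 must send to both processor 0 and processor 1, so at least two steps are needed; the first-fit pass places the remaining messages alongside the first one.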