Irregular array redistribution has received attention recently because it can distribute data segments of different sizes to heterogeneous processors according to their computational ability. For the same reason, it has also been studied as a means of load balancing. High Performance Fortran version 2 (HPF2) provides the GEN_BLOCK distribution format, which facilitates generalized block distributions. In this paper, we present a two-phase degree-reduction (TPDR) method for scheduling HPF2 irregular array redistribution. Using a bipartite communication graph, the first phase of TPDR schedules the communication links adjacent to processors with degree greater than two. A communication step is scheduled after each degree-reduction iteration. The second phase of TPDR schedules the remaining messages of all processors with degree two or degree one using an adjustable coloring mechanism. An extended algorithm based on TPDR is also presented. The proposed methods not only avoid node contention but also shorten the overall communication cost. They are also practical because of their low algorithmic complexity. To evaluate the performance of our methods, we have implemented both algorithms along with the divide-and-conquer algorithm and two scheduling mechanisms. The simulation results show an improvement in total communication cost.
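To make the scheduling problem concrete, the following minimal Python sketch derives the messages implied by two GEN_BLOCK distributions and groups them into contention-free communication steps. It is not the TPDR method: the scheduler is a plain greedy placement (longest message first) swapped in for illustration. The function names, the example block sizes, and the cost model (each step costs as much as its longest message) are all assumptions that do not come from the abstract.

```python
from itertools import accumulate


def gen_block_messages(src_sizes, dst_sizes):
    """List (sender, receiver, length) for every overlap between a source
    block and a destination block of two GEN_BLOCK distributions."""
    if sum(src_sizes) != sum(dst_sizes):
        raise ValueError("both distributions must cover the same array")
    src_bounds = [0] + list(accumulate(src_sizes))
    dst_bounds = [0] + list(accumulate(dst_sizes))
    messages = []
    i = j = 0
    while i < len(src_sizes) and j < len(dst_sizes):
        lo = max(src_bounds[i], dst_bounds[j])
        hi = min(src_bounds[i + 1], dst_bounds[j + 1])
        if hi > lo:
            messages.append((i, j, hi - lo))
        # Advance whichever block ends first.
        if src_bounds[i + 1] <= dst_bounds[j + 1]:
            i += 1
        else:
            j += 1
    return messages


def greedy_schedule(messages):
    """Illustrative greedy scheduler (NOT TPDR): place each message, longest
    first, into the first step where its sender and receiver are both idle,
    so no processor sends or receives twice in one step."""
    steps = []
    for msg in sorted(messages, key=lambda m: -m[2]):
        sender, receiver, _ = msg
        for step in steps:
            if all(s != sender and r != receiver for s, r, _ in step):
                step.append(msg)
                break
        else:
            steps.append([msg])
    return steps


def schedule_cost(steps):
    """Assumed cost model: a step takes as long as its longest message."""
    return sum(max(length for _, _, length in step) for step in steps)


# Hypothetical example: three senders and three receivers, 16 elements.
messages = gen_block_messages([7, 3, 6], [4, 8, 4])
steps = greedy_schedule(messages)
```

In this example receiver 1 takes part in three messages, so any contention-free schedule needs at least three steps. The greedy placement makes no optimality guarantee in general; it only shows what node contention means in the bipartite communication graph that the abstract refers to.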