Array redistribution is often needed to execute a data-parallel program efficiently on distributed-memory multicomputers. To minimize the data transfer cost of redistribution, processor mapping techniques have been proposed that reduce the number of data elements redistributed. These techniques require that the beginning data elements on each processor not be moved during the redistribution. To satisfy practical computation needs, however, a programmer may require other data elements to remain un-redistributed (localized). In this paper, we propose a flexible processor mapping technique for Block-Cyclic redistribution that allows the programmer to localize the required data elements. We also present an efficient redistribution method for redistributions that employ the proposed technique. Our experimental results analyze and present the reduction in data transfer cost and the improvement in system performance for redistributions with data localization.
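To make the idea of localization under a processor mapping concrete, the following is a minimal Python sketch. It is not the technique proposed in the paper: it assumes a one-dimensional array, the same number of processors before and after the redistribution, and a mapping given as a simple permutation of processor ids. The function names (`owner`, `localized`, `moved_elements`) are hypothetical and chosen only for illustration.

```python
def owner(i, block, nprocs):
    """Logical processor that owns global index i under a
    block-cyclic distribution with the given block size."""
    return (i // block) % nprocs


def localized(indices, src_block, dst_block, nprocs, mapping):
    """Indices that stay on the same physical processor.

    mapping[q] is the physical processor assigned to logical
    destination processor q (an assumption of this sketch).
    """
    return [
        i for i in indices
        if owner(i, src_block, nprocs) == mapping[owner(i, dst_block, nprocs)]
    ]


def moved_elements(n, src_block, dst_block, nprocs, mapping=None):
    """Number of elements of an n-element array that must be transferred."""
    if mapping is None:
        mapping = list(range(nprocs))  # identity: no remapping
    kept = localized(range(n), src_block, dst_block, nprocs, mapping)
    return n - len(kept)


if __name__ == "__main__":
    # 8 elements, 2 processors, block size 4 redistributed to block size 2.
    print(localized(range(8), 4, 2, 2, (0, 1)))  # identity mapping
    print(localized(range(8), 4, 2, 2, (1, 0)))  # swapped mapping
```

In this small case both mappings transfer the same number of elements, but they keep different elements in place: the identity mapping localizes the beginning elements on processor 0, while the swapped mapping localizes the middle of the array. Choosing the mapping according to which elements the programmer needs to remain local is the kind of flexibility the abstract describes.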