We present a uniform framework for a classical problem: redistribution of a multi-dimensional array. Using a generalized circulant matrix formalism, we derive efficient direct, indirect, and hybrid contention-free communication schedules. Our indirect schedule significantly reduces the number of communication steps compared with previous approaches. Our approach exploits the regularity of block-cyclic redistribution to minimize index computation overhead. For 2-D redistribution, when the block size increases by factors of K1 and K2 along the two dimensions and the processor topology remains fixed, our indirect schedule performs the redistribution in O(log(K1K2)) communication steps. When the block size is fixed and the processor topology is transposed, our indirect schedule takes O(log(L/G)) communication steps. Implementations of our algorithms on the IBM SP-2 show superior performance over previous approaches.
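To make the problem concrete, the following is a minimal one-dimensional sketch of what a block-cyclic redistribution has to move, assuming the usual ownership rule in which global index i under a cyclic(b) distribution on P processors belongs to processor (i div b) mod P. The function names (`owner`, `send_counts`, `shift_schedule`) and the parameters are hypothetical, chosen only for illustration. The schedule shown is a plain cyclic-shift schedule, in which every step is a permutation of the processors and therefore has no receiver contention. It is not the generalized circulant matrix construction of the paper, and it does not reproduce the logarithmic step count of the indirect schedule.

```python
def owner(i, block, nprocs):
    """Processor that owns global index i under a cyclic(block) distribution."""
    return (i // block) % nprocs


def send_counts(n, nprocs, x, k):
    """counts[src][dst] = number of elements that move from src to dst
    when an n-element array goes from cyclic(x) to cyclic(k * x)."""
    counts = [[0] * nprocs for _ in range(nprocs)]
    for i in range(n):
        src = owner(i, x, nprocs)
        dst = owner(i, k * x, nprocs)
        counts[src][dst] += 1
    return counts


def shift_schedule(nprocs):
    """Generic direct schedule (an assumption, not the paper's method):
    in step s, processor p sends to (p + s) % nprocs, so each step
    is a permutation and no processor receives two messages at once."""
    return [[(p, (p + s) % nprocs) for p in range(nprocs)]
            for s in range(nprocs)]


if __name__ == "__main__":
    counts = send_counts(n=48, nprocs=4, x=1, k=2)
    for step, pairs in enumerate(shift_schedule(4)):
        sizes = [counts[src][dst] for src, dst in pairs]
        print(f"step {step}: pairs={pairs} message sizes={sizes}")
```

A direct schedule of this kind uses up to P steps. The indirect schedules described in the abstract instead forward data through intermediate processors, which lowers the number of steps.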