This paper proposes a new fine-grained data distribution operation, MPI_Alltoall_specific, that distributes data element by element: each element is sent to a specific target process. The operation can be used to implement the irregular data distribution operations required, for example, in particle codes. We present several implementation variants of MPI_Alltoall_specific, based on collective MPI operations, on point-to-point communication operations, or on parallel sorting. We discuss the properties of these variants and present performance results for different data sets, measured on two highly scalable hardware platforms, including a Blue Gene/P system.
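To make the element-wise semantics concrete, the following is a minimal sketch in plain Python that simulates the redistribution within a single program, without MPI. It is not the paper's implementation: the function name `alltoall_specific_sim`, its arguments, and the bucket-then-exchange structure are assumptions chosen for illustration, loosely modelled on a variant built on a collective exchange. The ordering of received elements (by source rank, then by original position) is likewise an assumption of this sketch.

```python
def alltoall_specific_sim(local_elements, targets):
    """Simulate an element-wise redistribution across P processes.

    local_elements[p] is the list of elements held by process p.
    targets[p][i] is the destination rank of local_elements[p][i].
    Returns received, where received[q] lists the elements that
    process q holds after the redistribution.
    (Hypothetical helper; names and layout are illustrative only.)
    """
    nprocs = len(local_elements)
    if len(targets) != nprocs:
        raise ValueError("one target list is required per process")

    # Step 1: every process groups its elements by target rank,
    # which yields one send bucket per destination.
    send = [[[] for _ in range(nprocs)] for _ in range(nprocs)]
    for p in range(nprocs):
        if len(local_elements[p]) != len(targets[p]):
            raise ValueError("each element needs exactly one target")
        for elem, t in zip(local_elements[p], targets[p]):
            if not 0 <= t < nprocs:
                raise ValueError(f"invalid target rank {t}")
            send[p][t].append(elem)

    # Step 2: exchange the buckets. In a real MPI program this would
    # be an irregular all-to-all exchange; here it is a nested loop.
    received = [[] for _ in range(nprocs)]
    for q in range(nprocs):
        for p in range(nprocs):
            received[q].extend(send[p][q])
    return received


if __name__ == "__main__":
    elements = [["a0", "a1", "a2"], ["b0", "b1"], ["c0"]]
    targets = [[2, 0, 2], [0, 1], [1]]
    for rank, items in enumerate(alltoall_specific_sim(elements, targets)):
        print(rank, items)
```

In this sketch the caller specifies only a target rank per element; the per-destination counts and buffer layout that an irregular all-to-all exchange needs are derived internally in Step 1.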