Recent studies have shown that processes in real applications can arrive at collective calls at different times. This imbalanced process arrival pattern can significantly affect the performance of collective operations. MPI_Alltoall() is a communication-intensive collective operation that is used in many parallel scientific applications. Its efficient implementation under different process arrival patterns is critical to the performance of applications that use it frequently. In this paper, we propose novel RDMA-based, process-arrival-pattern-aware MPI_Alltoall() algorithms for InfiniBand clusters. We extend the algorithms to be shared-memory aware for small- to medium-sized messages. The micro-benchmark and application results indicate that, when processes arrive at different times, the proposed algorithms outperform both the native implementation and their counterparts that are not process-arrival-pattern aware.
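To make the two ideas in the abstract concrete, the following is a minimal Python sketch, not the algorithms proposed in the paper. The first function models the data movement that MPI_Alltoall() performs: block j sent by rank i ends up as block i received by rank j. The other two functions contrast an arrival-oblivious peer order with a hypothetical arrival-aware one. The names `fixed_order` and `arrival_aware_order`, and the assumption that every rank already knows all arrival times, are illustrative only; a real implementation would have to detect arrivals at run time.

```python
def alltoall(send):
    """Reference semantics of MPI_Alltoall: send[i][j] becomes recv[j][i]."""
    p = len(send)
    return [[send[src][dst] for src in range(p)] for dst in range(p)]


def fixed_order(rank, p):
    """Arrival-oblivious peer order: rank+1, rank+2, ... (mod p)."""
    return [(rank + k) % p for k in range(1, p)]


def arrival_aware_order(rank, arrival):
    """Hypothetical arrival-aware order (an assumption, not the paper's method).

    Peers that arrived earliest are served first; sorted() is stable, so
    peers with equal arrival times keep their fixed_order position.
    """
    p = len(arrival)
    return sorted(fixed_order(rank, p), key=lambda peer: arrival[peer])


if __name__ == "__main__":
    p = 4
    send = [[f"{i}->{j}" for j in range(p)] for i in range(p)]
    recv = alltoall(send)
    print(recv[2])  # ['0->2', '1->2', '2->2', '3->2']

    arrival = [0.0, 5.0, 1.0, 0.5]  # made-up arrival times; rank 1 is late
    print(fixed_order(0, p))                # [1, 2, 3]
    print(arrival_aware_order(0, arrival))  # [3, 2, 1]
```

In this toy model, rank 0 under the fixed order would first wait on the late rank 1, whereas the arrival-aware order defers rank 1 until the peers that are already present have been served.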