We revisit RDMA-based rendezvous protocols for MPI on clusters equipped with RDMA-capable host channel adapters (HCAs) and many-core processors, and propose two improved protocols. Conventional sender-initiated rendezvous protocols incur costly processor-device communication over the PCI bus to detect completion of an RDMA transfer. Conventional receiver-initiated rendezvous protocols must send extra control messages whenever the memory slot polled in the receive buffer happens to hold the same value as the corresponding location in the send buffer. The first proposed protocol polls a memory slot in the receive buffer, eliminating the processor-device communication. The second proposed protocol randomizes the value placed in the polled memory slot, reducing the number of extra control messages. We evaluated the proposed protocols with micro-benchmarks and the NAS Parallel Benchmarks. The first proposed protocol shows a benefit over the conventional protocols, and the second reduces execution time by up to 11.14% compared to the first.