We revisit RDMA-based rendezvous protocols for MPI on clusters equipped with RDMA-capable host channel adapters (HCAs) and many-core processors, and propose two improved protocols. Conventional sender-initiated rendezvous protocols incur costly processor-device communication over the PCI bus to detect completion of an RDMA transfer. Conventional receiver-initiated rendezvous protocols must send extra control messages whenever the memory slot polled in the receive buffer happens to hold the same value as the corresponding location in the send buffer. The first proposed protocol polls a memory slot in the receive buffer, eliminating the processor-device communication. The second proposed protocol randomizes the value placed in the polled memory slot, reducing the number of extra control messages. We evaluated the proposed protocols with micro-benchmarks and the NAS Parallel Benchmarks. The first proposed protocol shows a benefit over the conventional protocols, and the second reduces execution time by up to 11.14% compared to the first.