High performance RDMA protocols in HPC

  • Authors:
  • Tim S. Woodall (Los Alamos National Laboratory, Advanced Computing Laboratory)
  • Galen M. Shipman (Los Alamos National Laboratory, Advanced Computing Laboratory)
  • George Bosilca (Dept. of Computer Science, University of Tennessee)
  • Richard L. Graham (Los Alamos National Laboratory, Advanced Computing Laboratory)
  • Arthur B. Maccabe (Dept. of Computer Science, University of New Mexico)

  • Venue:
  • EuroPVM/MPI'06 Proceedings of the 13th European PVM/MPI User's Group conference on Recent advances in parallel virtual machine and message passing interface
  • Year:
  • 2006

Abstract

Modern network communication libraries that leverage Remote Direct Memory Access (RDMA) and OS-bypass protocols, such as InfiniBand [2] and Myrinet [10], can offer significant performance advantages over conventional send/receive protocols. However, this performance often comes with hidden per-buffer setup costs [4]. This paper describes a unique long-message MPI [9] library 'pipeline' protocol that addresses these costs while avoiding some of the pitfalls of existing techniques. By using portable send/receive semantics to hide the startup cost of the pipeline algorithm, and then overlapping the cost of memory registration with RDMA operations, this protocol provides very good performance for any large-memory usage pattern. The approach avoids non-portable memory hooks and does not require keeping registered memory from being returned to the OS. With this protocol, bandwidth may be increased by up to 67% when memory buffers are not effectively reused, while superior performance is maintained on the effective bandwidth benchmark. Several user-level protocols implemented in Open MPI's PML (point-to-point messaging layer) are compared and contrasted with this 'pipeline' protocol.
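The scheduling idea behind the pipeline protocol can be sketched abstractly. The following Python fragment is an illustrative model, not Open MPI code: all names (`pipeline_schedule`, `send_recv_copy`, `register`, `rdma_write`, `deregister`) are hypothetical. It shows the key ordering the abstract describes: the first fragment travels over portable copy-based send/receive to hide pipeline startup, registration of fragment i+1 is issued before the RDMA write of fragment i (modeling the overlap of registration cost with data movement), and each fragment is deregistered afterward so pinned memory can be returned to the OS.

```python
FRAG_SIZE = 4  # bytes per pipeline fragment (tiny, for illustration only)

def pipeline_schedule(message_len, frag_size=FRAG_SIZE):
    """Return the issue order of protocol steps for one long message.

    Each step is a (operation, fragment_index) pair. The issue order
    models the overlap: register(i + 1) is issued before rdma_write(i),
    so registration proceeds while the previous write is in flight.
    """
    nfrags = -(-message_len // frag_size)  # ceiling division
    # Fragment 0: portable copy-in/out send/receive hides pipeline startup.
    steps = [("send_recv_copy", 0)]
    if nfrags > 1:
        steps.append(("register", 1))      # pin the first RDMA fragment
    for i in range(1, nfrags):
        if i + 1 < nfrags:
            # Issue the next registration before this fragment's write,
            # so the two costs overlap in a real asynchronous runtime.
            steps.append(("register", i + 1))
        steps.append(("rdma_write", i))
        # Deregister immediately: no registration cache, so the memory
        # can be returned to the OS without non-portable memory hooks.
        steps.append(("deregister", i))
    return steps

schedule = pipeline_schedule(20)  # 20-byte message -> 5 fragments
```

In this model, reusing a buffer gains nothing from caching (there is none), which is the trade-off the paper's overlap is designed to pay for: registration cost is hidden behind RDMA transfers instead of being amortized across reuses.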