A New DMA Registration Strategy for Pinning-Based High Performance Networks

Authors:
Christian Bell;Dan Bonachea
Affiliations:
-;-
Venue:
IPDPS '03 Proceedings of the 17th International Symposium on Parallel and Distributed Processing
Year:
2003

Abstract

This paper proposes a new memory registration strategy for supporting Remote DMA (RDMA) operations over pinning-based networks, as existing approaches are insufficient for efficiently implementing Global Address Space (GAS) languages. Although existing approaches often maximize bandwidth, they require levels of synchronization that discourage one-sided communication, and can have significant latency costs for small messages. The proposed Firehose algorithm attempts to expose one-sided, zero-copy communication as a common case, while minimizing the number of host-level synchronizations required to support remote memory operations. The basic idea is to reap the performance benefits of a Pin-Everything approach in the common case (without the drawbacks) and revert to a Rendezvous-based approach to handle the uncommon case. In all cases, the algorithm attempts to amortize the cost of synchronization and pinning over multiple remote memory operations, improving performance over Rendezvous by avoiding many handshaking messages and the cost of re-pinning recently used pages. Performance results are presented which demonstrate that the cost of two-sided handshaking and memory registration is negligible when the set of remotely referenced memory pages on a given node is smaller than the physical memory (where the entire working set can remain pinned), and for applications with larger working sets the performance degrades gracefully and consistently outperforms conventional approaches.
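
The amortization idea in the abstract can be illustrated with a toy model. The sketch below is not the Firehose algorithm itself, whose details the abstract does not give. It only assumes a bounded set of pinned remote pages with least-recently-used replacement; the class name `PinnedPageCache`, the 4096-byte page size, and the LRU policy are all hypothetical choices made for illustration. A remote operation that touches only pinned pages counts as the one-sided common case, while an operation that touches an unpinned page pays for one handshake and for pinning the missing pages.

```python
from collections import OrderedDict

PAGE_SIZE = 4096  # assumed page size in bytes (illustrative only)


class PinnedPageCache:
    """Toy model of a bounded set of pinned remote pages.

    Hypothetical sketch: the LRU policy and the counters are assumptions,
    not details taken from the paper.
    """

    def __init__(self, max_pinned_pages):
        self.max_pinned_pages = max_pinned_pages
        self.pinned = OrderedDict()  # page number -> True, oldest first
        self.handshakes = 0  # two-sided round trips (uncommon case)
        self.pins = 0        # page registrations performed
        self.unpins = 0      # page deregistrations performed

    def remote_put(self, addr, nbytes):
        """Simulate a put to [addr, addr + nbytes) on the remote node.

        Returns True when every page was already pinned (one-sided case).
        """
        first = addr // PAGE_SIZE
        last = (addr + nbytes - 1) // PAGE_SIZE
        pages = list(range(first, last + 1))
        if len(pages) > self.max_pinned_pages:
            raise ValueError("transfer spans more pages than can be pinned")

        missing = [p for p in pages if p not in self.pinned]
        # Mark already-pinned pages as recently used so they are evicted last.
        for p in pages:
            if p in self.pinned:
                self.pinned.move_to_end(p)

        if missing:
            # One handshake is amortized over all missing pages of this put.
            self.handshakes += 1
            for p in missing:
                while len(self.pinned) >= self.max_pinned_pages:
                    self.pinned.popitem(last=False)  # evict the LRU page
                    self.unpins += 1
                self.pinned[p] = True
                self.pins += 1
        return not missing


cache = PinnedPageCache(max_pinned_pages=2)
cache.remote_put(0, 100)     # miss: page 0 is pinned after one handshake
cache.remote_put(200, 100)   # hit: page 0 is still pinned, no handshake
cache.remote_put(4096, 100)  # miss: page 1 is pinned
cache.remote_put(8192, 100)  # miss: page 2 is pinned, page 0 is evicted
print(cache.handshakes, cache.pins, cache.unpins)  # prints "3 3 1"
```

In this model, a working set that fits within `max_pinned_pages` pays for each page once and then proceeds without further handshakes, which mirrors the abstract's claim that registration cost becomes negligible when the working set can remain pinned. Larger working sets cause evictions and repeated pinning, so the cost grows with the miss rate.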