A speculative and adaptive MPI rendezvous protocol over RDMA-enabled interconnects

Authors:
Mohammad J. Rashti; Ahmad Afsahi
Affiliations:
Department of Electrical and Computer Engineering, Queen's University, Kingston, ON, Canada; Department of Electrical and Computer Engineering, Queen's University, Kingston, ON, Canada
Venue:
International Journal of Parallel Programming
Year:
2009

Abstract

Overlapping computation with communication is a key technique for hiding communication latency in parallel applications. The Message Passing Interface (MPI) is a widely used message-passing standard for high-performance computing. One of the most important factors in achieving good overlap is the ability of the MPI implementation to make progress on outstanding communication operations. In this paper, we propose a novel speculative MPI Rendezvous protocol that uses RDMA Read and RDMA Write to improve communication progress and, consequently, overlap ability. Performance results from a modified MPICH2 implementation over 10-Gigabit iWARP Ethernet show a significant (80-100%) improvement in receiver-side overlap and progress ability. We have also observed up to a 30% improvement in application wait time for some NPB applications as well as the RADIX application. For applications that do not benefit from this protocol, an adaptation mechanism stops the speculation to reduce the protocol overhead.
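
The link between communication progress and overlap can be illustrated with a toy timeline model. This is only a sketch under stated assumptions and does not reproduce the protocol proposed in the paper: the function names, the time units, and the simplification that control-message latency is zero are all hypothetical. The model compares a receiver whose rendezvous handshake is handled only when it re-enters the MPI library with an idealized receiver whose handshake is handled as soon as the sender is ready.

```python
def receiver_wait_time(compute, sender_ready, transfer, progress_during_compute):
    """Time the receiver spends blocked in its wait call (toy model).

    Assumptions (hypothetical, not taken from the paper):
      - the receiver posts a nonblocking receive at t = 0, computes for
        `compute` time units, and then blocks in a wait call;
      - the sender reaches its send call at t = `sender_ready`;
      - the data transfer takes `transfer` time units once it starts;
      - control-message latency is ignored.
    """
    if progress_during_compute:
        # Idealized progress: the handshake is handled as soon as the
        # sender is ready, even while the receiver is still computing.
        transfer_start = sender_ready
    else:
        # No progress outside the library: the handshake is handled only
        # after the receiver finishes computing and enters the wait call.
        transfer_start = max(sender_ready, compute)
    transfer_end = transfer_start + transfer
    return max(0.0, transfer_end - compute)


def overlap_ratio(compute, sender_ready, transfer, progress_during_compute):
    """Fraction of the transfer time hidden behind the receiver's computation."""
    wait = receiver_wait_time(compute, sender_ready, transfer,
                              progress_during_compute)
    hidden = transfer - min(wait, transfer)
    return hidden / transfer


if __name__ == "__main__":
    # Receiver computes for 10 units; sender is ready at t = 2; transfer takes 5.
    for progress in (False, True):
        wait = receiver_wait_time(10.0, 2.0, 5.0, progress)
        ratio = overlap_ratio(10.0, 2.0, 5.0, progress)
        print(f"progress={progress}: wait={wait}, overlap={ratio}")
```

In this model, the whole transfer is exposed as wait time when no progress is made during computation, and it is fully hidden when the handshake is handled early. The figures reported in the abstract come from measurements on real hardware, not from a model of this kind.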