X-SRQ - Improving Scalability and Performance of Multi-core InfiniBand Clusters

  • Authors:
  • Galen M. Shipman, Stephen Poole, Pavel Shamis, Ishai Rabinovitz

  • Affiliations:
  • Oak Ridge National Laboratory, Oak Ridge, TN, USA (Shipman, Poole); Mellanox Technologies, Yokneam, Israel (Shamis, Rabinovitz)

  • Venue:
  • Proceedings of the 15th European PVM/MPI Users' Group Meeting on Recent Advances in Parallel Virtual Machine and Message Passing Interface
  • Year:
  • 2008

Abstract

To improve the scalability of InfiniBand on large-scale clusters, Open MPI introduced a protocol known as B-SRQ [2]. This protocol was shown to provide much better memory utilization of send and receive buffers for a wide variety of benchmarks and real-world applications. Unfortunately, B-SRQ increases the number of connections between communicating peers: while addressing one scalability problem of InfiniBand, the protocol introduced another. To alleviate the connection scalability problem of the B-SRQ protocol, a small enhancement to the reliable connection transport was requested which allows multiple shared receive queues to be attached to a single reliable connection. This modified reliable connection transport is now known as the extended reliable connection (XRC) transport. X-SRQ is a new transport protocol in Open MPI, based on B-SRQ, which takes advantage of this improvement in connection scalability. This paper introduces the X-SRQ protocol and details its significantly improved scalability over B-SRQ, including a reduction of the memory footprint of connection state by as much as two orders of magnitude on large-scale multi-core systems. In addition to improving scalability, the protocol improves the performance of latency-sensitive collective operations by up to 38% while significantly decreasing the variability of results. A detailed analysis of the improved memory scalability and the improved performance is presented.
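The mechanism the abstract describes is that a single XRC send queue pair can address any of several shared receive queues on the remote node, so connection state no longer multiplies with the number of receive-buffer buckets. The sketch below illustrates that idea using the modern rdma-core verbs API; note this is an illustrative approximation, not Open MPI's actual implementation, and the 2008 paper used Mellanox's earlier XRC extension API, whose function names differ. Error handling is abbreviated, and the bucket count of 4 is an arbitrary assumption.

    /* Sketch: several XRC SRQs reachable through one XRC send QP
     * (modern rdma-core verbs; illustrative only). */
    #include <stdio.h>
    #include <fcntl.h>
    #include <infiniband/verbs.h>

    int main(void)
    {
        int num;
        struct ibv_device **devs = ibv_get_device_list(&num);
        if (!devs || num == 0) { fprintf(stderr, "no IB devices\n"); return 1; }

        struct ibv_context *ctx = ibv_open_device(devs[0]);
        struct ibv_pd *pd = ibv_alloc_pd(ctx);
        struct ibv_cq *cq = ibv_create_cq(ctx, 128, NULL, NULL, 0);

        /* An XRC domain groups the SRQs that remote XRC send QPs may target. */
        struct ibv_xrcd_init_attr xattr = {
            .comp_mask = IBV_XRCD_INIT_ATTR_FD | IBV_XRCD_INIT_ATTR_OFLAGS,
            .fd        = -1,      /* process-private domain */
            .oflags    = O_CREAT,
        };
        struct ibv_xrcd *xrcd = ibv_open_xrcd(ctx, &xattr);

        /* One XRC SRQ per receive-buffer bucket (the B-SRQ/X-SRQ scheme).
         * All of them are reachable through ONE XRC send QP on the sender. */
        enum { NBUCKETS = 4 };  /* assumed bucket count, for illustration */
        struct ibv_srq *srqs[NBUCKETS];
        for (int i = 0; i < NBUCKETS; i++) {
            struct ibv_srq_init_attr_ex sattr = {
                .attr      = { .max_wr = 256, .max_sge = 1 },
                .comp_mask = IBV_SRQ_INIT_ATTR_TYPE | IBV_SRQ_INIT_ATTR_XRCD |
                             IBV_SRQ_INIT_ATTR_CQ   | IBV_SRQ_INIT_ATTR_PD,
                .srq_type  = IBV_SRQT_XRC,
                .xrcd      = xrcd,
                .cq        = cq,
                .pd        = pd,
            };
            srqs[i] = ibv_create_srq_ex(ctx, &sattr);
            /* The SRQ number a sender places in its work request to pick
             * this bucket: */
            uint32_t srqn;
            ibv_get_srq_num(srqs[i], &srqn);
            printf("bucket %d -> SRQ number 0x%x\n", i, srqn);
        }

        /* A single XRC send QP suffices to reach every SRQ above; the
         * sender selects the bucket per message via the remote_srqn
         * field (wr.qp_type.xrc.remote_srqn) of ibv_post_send(). */
        struct ibv_qp_init_attr_ex qattr = {
            .qp_type   = IBV_QPT_XRC_SEND,
            .send_cq   = cq,
            .cap       = { .max_send_wr = 128, .max_send_sge = 1 },
            .comp_mask = IBV_QP_INIT_ATTR_PD,
            .pd        = pd,
        };
        struct ibv_qp *qp = ibv_create_qp_ex(ctx, &qattr);
        (void)qp;
        /* ... connection establishment and posting of sends omitted ... */
        return 0;
    }

With plain reliable connections, each bucket would require its own QP per peer, so per-node connection state grows with the bucket count; decoupling the SRQ choice from the QP is what yields the memory-footprint reduction the paper reports.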