A reliable and scalable striping protocol
Conference proceedings on Applications, technologies, architectures, and protocols for computer communications
Using MPI (2nd ed.): portable parallel programming with the message-passing interface
Using MPI (2nd ed.): portable parallel programming with the message-passing interface
Proceedings of the 2000 ACM/IEEE conference on Supercomputing
Parallel view-dependent isosurface extraction using multi-pass occlusion culling
PVG '01 Proceedings of the IEEE 2001 symposium on parallel and large-data visualization and graphics
MPI-LAPI: An Efficient Implementation of MPI for IBM RS/6000 SP Systems
IEEE Transactions on Parallel and Distributed Systems
A network-failure-tolerant message-passing system for terascale clusters
ICS '02 Proceedings of the 16th international conference on Supercomputing
Ultra-high performance communication with MPI and the Sun fire™ link interconnect
Proceedings of the 2002 ACM/IEEE conference on Supercomputing
High performance RDMA-based MPI implementation over InfiniBand
ICS '03 Proceedings of the 17th annual international conference on Supercomputing
Striping within the network subsystem
IEEE Network: The Magazine of Global Internetworking
Proceedings of the 2006 ACM/IEEE conference on Supercomputing
Improving communication-phase completion times in HPC clusters through congestion mitigation
SYSTOR '09 Proceedings of SYSTOR 2009: The Israeli Experimental Systems Conference
Employing transport layer multi-railing in cluster networks
Journal of Parallel and Distributed Computing
Region-Based Prefetch Techniques for Software Distributed Shared Memory Systems
CCGRID '10 Proceedings of the 2010 10th IEEE/ACM International Conference on Cluster, Cloud and Grid Computing
Efficient RDMA-based multi-port collectives on multi-rail QsNetII clusters
IPDPS'06 Proceedings of the 20th international conference on Parallel and distributed processing
Optimizing bandwidth limited problems using one-sided communication and overlap
IPDPS'06 Proceedings of the 20th international conference on Parallel and distributed processing
Adaptive connection management for scalable MPI over InfiniBand
IPDPS'06 Proceedings of the 20th international conference on Parallel and distributed processing
HiPC'05 Proceedings of the 12th international conference on High Performance Computing
Using CMT in SCTP-based MPI to exploit multiple interfaces in cluster nodes
PVM/MPI'07 Proceedings of the 14th European conference on Recent Advances in Parallel Virtual Machine and Message Passing Interface
Hi-index | 0.00 |
In the area of cluster computing, InfiniBand is becoming increasingly popular due to its open standard and high performance. However, even with InfiniBand, network bandwidth can still become the performance bottleneck for some of todayýs most demanding applications. In this paper, we study the problem of how to overcome the bandwidth bottleneck by using multirail networks. We present different ways of setting up multirail networks with InfiniBand and propose a unified MPI design that can support all these approaches. We have also discussed various important design issues and provided in-depth discussions of different policies of using multirail networks, including an adaptive striping scheme that can dynamically change the striping parameters based on current system condition. We have implemented our design and evaluated it using both microbenchmarks and applications. Our performance results show that multirail networks can significant improve MPI communication performance. With a two rail InfiniBand cluster, we have achieved almost twice the bandwidth and half the latency for large messages compared with the original MPI. At the application level, the multirail MPI can significantly reduce communication time as well as running time depending on the communication pattern. We have also shown that the adaptive striping scheme can achieve excellent performance without a priori knowledge of the bandwidth of each rail.