Improving communication-phase completion times in HPC clusters through congestion mitigation
SYSTOR '09 Proceedings of SYSTOR 2009: The Israeli Experimental Systems Conference
Dynamic and Distributed Multipath Routing Policy for High-Speed Cluster Networks
CCGRID '09 Proceedings of the 2009 9th IEEE/ACM International Symposium on Cluster Computing and the Grid
Channel reservation protocol for over-subscribed channels and destinations
SC '13 Proceedings of the International Conference on High Performance Computing, Networking, Storage and Analysis
InfiniBand is quickly becoming the interconnection network of choice for High Performance Computing (HPC) systems. It defines a System Area Network (SAN) environment in which multiple processor nodes and I/O devices are interconnected by a switched point-to-point network. The distinctive characteristics of InfiniBand networks, such as no packet dropping, small buffer sizes, and low latency, make congestion control differ from the traditional mechanisms used in other high-speed networks and pose additional challenges. This paper presents an improved Explicit Congestion Notification (ECN) packet marking mechanism for InfiniBand. In addition, we propose an effective source response function, Power Increase and Power Decrease (PIPD), which adopts rate control with a window limit to reduce congestion of multiple-class traffic in InfiniBand networks. Simulation experiments demonstrate that the proposed source response function is effective in InfiniBand networks and outperforms the existing ECN schemes.
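
The abstract does not give the PIPD formulas, so the following is only a rough sketch of what a power-law source response with a window limit could look like. It borrows a generic power-law update (increase by alpha / rate^k on an unmarked acknowledgement, decrease by beta * rate^l on an ECN-marked one) as a stand-in; the class name, the parameters `alpha`, `beta`, `k`, `l`, and their default values are all hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass


@dataclass
class PowerLawSource:
    """Illustrative rate controller with a window limit (not the paper's PIPD)."""

    rate: float = 0.5        # injection rate as a fraction of link rate
    min_rate: float = 0.01   # hypothetical floor so the source never stalls
    max_rate: float = 1.0    # link rate
    alpha: float = 0.05      # hypothetical increase gain
    beta: float = 0.5        # hypothetical decrease gain
    k: float = 0.5           # hypothetical increase exponent
    l: float = 1.0           # hypothetical decrease exponent
    window_limit: int = 8    # maximum number of unacknowledged packets
    in_flight: int = 0

    def can_send(self) -> bool:
        # The window limit caps outstanding packets regardless of the rate.
        return self.in_flight < self.window_limit

    def on_send(self) -> None:
        if not self.can_send():
            raise RuntimeError("window limit reached")
        self.in_flight += 1

    def on_ack(self, ecn_marked: bool) -> None:
        self.in_flight = max(0, self.in_flight - 1)
        if ecn_marked:
            # Congestion was signalled: reduce the rate by a power of itself.
            new_rate = self.rate - self.beta * self.rate ** self.l
        else:
            # No congestion: raise the rate, more gently at higher rates.
            new_rate = self.rate + self.alpha / self.rate ** self.k
        self.rate = min(self.max_rate, max(self.min_rate, new_rate))
```

In this sketch the rate and the window act independently: the rate reacts to ECN marks, while the window bounds how much data can be in flight, which matters in a lossless network with small switch buffers.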