Clustered systems have become a dominant architecture for scalable high-performance supercomputers. In these large-scale machines, network performance and scalability are as critical as compute-node speed. InfiniBand™ has become a commodity networking solution that meets the stringent latency, bandwidth, and scalability requirements of these clusters. Network performance also depends on the topology, the packet routing, and the communication patterns that the distributed application exercises. Fat-trees are the topology used to build most large clusters because they are scalable, maintain cross-bisectional bandwidth (CBB), and are practical to build using fixed-arity switches. In this paper, we propose a fat-tree routing algorithm that provides a congestion-free all-to-all shift pattern by leveraging the static routing capability of InfiniBand™. The algorithm supports partially populated fat-trees built with switches of an arbitrary number of ports and with arbitrary CBB ratios. To evaluate the proposed algorithm, detailed switch and host simulation models were developed and multiple fabric topologies were simulated. The simulation results, together with measurements on real clusters, show an improvement in all-to-all delay achieved by avoiding congestion in the fabric. Copyright © 2009 John Wiley & Sons, Ltd. The paper was presented at the International Supercomputing Conference 2007 in Dresden, Germany.
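To make the notion of a congestion-free shift pattern concrete, the following is a minimal sketch, not the algorithm proposed in the paper. It assumes a fully populated two-level fat-tree with full CBB, hosts numbered contiguously per leaf switch, and one spine switch per leaf up-port. In shift stage s, host i sends to host (i + s) mod N. All function names (`max_link_load`, `by_dest_host`, `by_dest_leaf`, `worst_stage`) are hypothetical and exist only for this illustration. The sketch compares two static, destination-based up-port choices and counts how many flows share the busiest link.

```python
from collections import Counter


def max_link_load(num_leaves, hosts_per_leaf, shift, up_port):
    """Highest number of flows sharing any single link in one shift stage."""
    n = num_leaves * hosts_per_leaf
    load = Counter()
    for src in range(n):
        dst = (src + shift) % n  # shift pattern: every host sends to src + shift
        src_leaf = src // hosts_per_leaf
        dst_leaf = dst // hosts_per_leaf
        if src_leaf == dst_leaf:
            continue  # traffic stays inside the leaf switch, no tree links used
        spine = up_port(dst, hosts_per_leaf)  # static choice based on destination only
        load[("up", src_leaf, spine)] += 1
        load[("down", spine, dst_leaf)] += 1
    return max(load.values(), default=0)


def by_dest_host(dst, hosts_per_leaf):
    """Assumed rule: spread destinations over up-ports by host index."""
    return dst % hosts_per_leaf


def by_dest_leaf(dst, hosts_per_leaf):
    """Assumed naive rule: one up-port per destination leaf switch."""
    return (dst // hosts_per_leaf) % hosts_per_leaf


def worst_stage(num_leaves, hosts_per_leaf, up_port):
    """Worst link load over all non-trivial shift stages."""
    n = num_leaves * hosts_per_leaf
    return max(
        max_link_load(num_leaves, hosts_per_leaf, s, up_port) for s in range(1, n)
    )


if __name__ == "__main__":
    print("by destination host:", worst_stage(4, 4, by_dest_host))
    print("by destination leaf:", worst_stage(4, 4, by_dest_leaf))
```

Under these assumptions, the per-host rule never places more than one flow on a link in any shift stage, while the per-leaf rule funnels all traffic between two leaf switches onto one up-link. Partially populated trees and other CBB ratios, which the paper addresses, are outside the scope of this sketch.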