Existing fat-tree routing algorithms fully exploit the path diversity of a fat-tree topology in the context of compute node traffic, but they lack support for deadlock-free and fully connected switch-to-switch communication. Such support is crucial for efficient system management, for example, in InfiniBand (IB) systems. With the general increase in system management capabilities found in modern IB switches, the lack of deadlock-free switch-to-switch communication is a problem for fat-tree-based IB installations because management traffic might cause routing deadlocks that bring the whole system down. This lack of deadlock-free communication affects all system management and diagnostic tools that use LID routing. In this paper, we propose the sFtree routing algorithm, which guarantees deadlock-free and fully connected switch-to-switch communication in fat-trees while maintaining the properties of the current fat-tree algorithm. We prove that the algorithm is deadlock-free and implement it in OpenSM for evaluation. We evaluate the performance of the sFtree algorithm experimentally on a small cluster and at large scale through simulation. The results confirm that the sFtree routing algorithm is deadlock-free and show that the impact of switch-to-switch management traffic on the end-node traffic is negligible.
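To illustrate why switch-to-switch paths can introduce routing deadlocks, the following minimal sketch checks a set of paths for cycles in the channel dependency graph, a standard sufficient test for deadlock freedom in deterministic routing. It is not the sFtree algorithm and not OpenSM code. The two-leaf, two-spine topology, the switch names (`L0`, `L1`, `S0`, `S1`), the function names, and both path sets are assumptions made for illustration only.

```python
from collections import defaultdict


def channel_dependencies(paths):
    """Build a channel dependency graph from a list of node paths.

    A channel is a directed link (u, v). A packet holding one channel while
    requesting the next one on its path creates a dependency edge.
    """
    deps = defaultdict(set)
    for path in paths:
        channels = list(zip(path, path[1:]))
        for held, wanted in zip(channels, channels[1:]):
            deps[held].add(wanted)
    return deps


def has_cycle(deps):
    """Return True if the dependency graph contains a directed cycle (DFS colouring)."""
    WHITE, GREY, BLACK = 0, 1, 2
    colour = defaultdict(int)  # every channel starts WHITE

    def visit(channel):
        colour[channel] = GREY
        for nxt in deps.get(channel, ()):
            if colour[nxt] == GREY:
                return True  # back edge: cycle found
            if colour[nxt] == WHITE and visit(nxt):
                return True
        colour[channel] = BLACK
        return False

    return any(colour[ch] == WHITE and visit(ch) for ch in list(deps))


# Hypothetical path set: leaf-to-leaf paths go up then down, while
# spine-to-spine paths must go down then up, each through a different leaf.
mixed_paths = [
    ["L0", "S0", "L1"],  # up-down
    ["S0", "L1", "S1"],  # down-up via L1
    ["L1", "S1", "L0"],  # up-down
    ["S1", "L0", "S0"],  # down-up via L0
]

# Hypothetical alternative: every down-up turn is confined to the single leaf L0.
confined_paths = [
    ["L0", "S0", "L1"],
    ["L1", "S1", "L0"],
    ["L0", "S1", "L1"],
    ["L1", "S0", "L0"],
    ["S0", "L0", "S1"],  # down-up via L0
    ["S1", "L0", "S0"],  # down-up via L0
]

mixed_has_cycle = has_cycle(channel_dependencies(mixed_paths))
confined_has_cycle = has_cycle(channel_dependencies(confined_paths))
```

In this toy topology the first path set yields a cyclic dependency among four channels, while the second does not. This shows only that where down-to-up turns are placed matters; it says nothing about how sFtree itself chooses its paths.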