The past few years have seen a rise in popularity of massively parallel architectures that use fat-trees as their interconnection networks. In this paper we study the communication performance of a parametric family of fat-trees, the k-ary n-trees, built with constant-arity switches interconnected in a regular topology. Through simulation on a 4-ary 4-tree with 256 nodes, we analyze some variants of an adaptive algorithm that utilize wormhole routing with one, two and four virtual channels. The experimental results show that the uniform, bit reversal and transpose traffic patterns are very sensitive to the flow control strategy. In all these cases, the saturation points are between 35-40% of the network capacity with one virtual channel, 55-60% with two virtual channels and around 75% with four virtual channels. The complement traffic, a representative of the class of the congestion-free communication patterns, reaches an optimal performance with a saturation point at 97% of the capacity for all flow control strategies.
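As an illustration of the synthetic traffic patterns named above, the following Python sketch computes the destination of each source node under the complement, bit reversal and transpose permutations for the 256-node case. The abstract does not define these patterns, so the bitwise definitions used here are the ones commonly found in the interconnection-network literature and should be read as an assumption; the function names are hypothetical and chosen only for this example.

```python
def num_nodes(k: int, n: int) -> int:
    """Processing nodes in a k-ary n-tree (k**n, e.g. 256 for a 4-ary 4-tree)."""
    return k ** n


def address_bits(nodes: int) -> int:
    """Bits needed to address `nodes` endpoints; requires a power of two."""
    bits = nodes.bit_length() - 1
    if nodes <= 0 or (1 << bits) != nodes:
        raise ValueError("node count must be a power of two")
    return bits


def complement(src: int, bits: int) -> int:
    """Assumed definition: invert every bit of the source address."""
    return ~src & ((1 << bits) - 1)


def bit_reversal(src: int, bits: int) -> int:
    """Assumed definition: bit i of the source becomes bit (bits - 1 - i)."""
    dst = 0
    for i in range(bits):
        if (src >> i) & 1:
            dst |= 1 << (bits - 1 - i)
    return dst


def transpose(src: int, bits: int) -> int:
    """Assumed definition: swap the upper and lower halves of the address."""
    if bits % 2:
        raise ValueError("transpose needs an even number of address bits")
    half = bits // 2
    low_mask = (1 << half) - 1
    return ((src & low_mask) << half) | (src >> half)


if __name__ == "__main__":
    nodes = num_nodes(4, 4)        # the 4-ary 4-tree from the abstract
    bits = address_bits(nodes)
    for name, pattern in [("complement", complement),
                          ("bit reversal", bit_reversal),
                          ("transpose", transpose)]:
        # Show where the first few sources send their traffic.
        print(name, [pattern(src, bits) for src in range(4)])
```

Each of these functions is a permutation of the node addresses, so every node sends to exactly one destination and receives from exactly one source. Uniform traffic, by contrast, would draw each destination at random and is not shown here.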