Dragonfly networks have recently been proposed as the interconnection network for forthcoming exascale supercomputers. Relying on large-radix routers, they build a low-diameter, high-throughput topology divided into multiple groups of routers. While minimal routing is appropriate for uniform traffic, adversarial traffic patterns can saturate inter-group (global) links and degrade performance. Such patterns arise in communication typical of many HPC applications, such as neighbor data exchanges in multi-dimensional space decompositions. Nonminimal routing is employed to handle these cases, and adaptive policies have been designed to select between minimal and nonminimal routing as traffic varies. However, previous work has not taken into account the saturation of intra-group (local) links. This paper shows that local link saturation can be common in these networks and that it can significantly reduce performance. The solution is to use nonminimal paths that avoid the saturated local links. However, this extends the maximum path length, and since all previous routing proposals prevent deadlock by using virtual channels in ascending order, it would imply unaffordable cost and complexity in the network routers. In this paper we introduce a novel routing/flow-control scheme that decouples routing from deadlock avoidance. Our model does not impose any dependencies between virtual channels, allowing on-the-fly (in-transit) adaptive routing of packets. To prevent deadlock we employ a deadlock-free escape subnetwork based on injection restriction. Simulations show that our model obtains lower latency, higher throughput, and faster adaptation to transient traffic, because it dynamically exploits higher path diversity to avoid saturated links. Notably, our proposal consumes traffic bursts 43% faster than previous ones.
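The effect of adversarial traffic on global links described above can be illustrated with a small sketch. It is not the paper's simulator or routing scheme. It assumes a simplified Dragonfly with exactly one global link per ordered pair of groups, ignores local links and routers entirely, and uses a plain oblivious (Valiant-style) random intermediate group as a stand-in for nonminimal routing. The function names and the parameters `num_groups`, `packets_per_group`, and `offset` are hypothetical, chosen only for this illustration.

```python
import random
from collections import Counter


def minimal_load(num_groups, packets_per_group, offset=1):
    """Global-link load when every group sends only to group (src + offset).

    Assumption: one directed global link per ordered pair of groups,
    identified here by the tuple (source_group, destination_group).
    """
    load = Counter()
    for src in range(num_groups):
        dst = (src + offset) % num_groups
        # Minimal routing: all packets share the single direct global link.
        load[(src, dst)] += packets_per_group
    return load


def valiant_load(num_groups, packets_per_group, offset=1, seed=0):
    """Same traffic, but each packet detours through a random third group."""
    rng = random.Random(seed)
    load = Counter()
    for src in range(num_groups):
        dst = (src + offset) % num_groups
        candidates = [g for g in range(num_groups) if g not in (src, dst)]
        for _ in range(packets_per_group):
            mid = rng.choice(candidates)
            # Nonminimal path: two global hops instead of one.
            load[(src, mid)] += 1
            load[(mid, dst)] += 1
    return load


def max_load(load):
    """Load on the busiest global link."""
    return max(load.values())


if __name__ == "__main__":
    groups, packets = 9, 1000
    print("minimal, busiest link:", max_load(minimal_load(groups, packets)))
    print("valiant, busiest link:", max_load(valiant_load(groups, packets)))
```

In this toy model, minimal routing places all of a group's packets on one global link, while the random detour spreads them over many links at the cost of doubling the total number of global hops. The local-link saturation that the paper focuses on is outside what this sketch models.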