On-the-Fly Adaptive Routing in High-Radix Hierarchical Networks

  • Authors:
  • Marina Garcia;Enrique Vallejo;Ramon Beivide;Miguel Odriozola;Cristobal Camarero;Mateo Valero;German Rodriguez;Jesus Labarta;Cyriel Minkenberg

  • Affiliations:
  • -;-;-;-;-;-;-;-;-

  • Venue:
  • ICPP '12 Proceedings of the 2012 41st International Conference on Parallel Processing
  • Year:
  • 2012

Quantified Score

Hi-index 0.00

Visualization

Abstract

Dragonfly networks have been recently proposed for the interconnection network of forthcoming exascale supercomputers. Relying on large-radix routers, they build a topology with low diameter and high throughput, divided into multiple groups of routers. While minimal routing is appropriate for uniform traffic patterns, adversarial traffic patterns can saturate inter-group links and degrade the obtained performance. Such traffic patterns occur in typical communication patterns used by many HPC applications, such as neighbor data exchanges in multi-dimensional space decompositions. Non-minimal traffic routing is employed to handle such cases. Adaptive policies have been designed to select between minimal and nonminimal routing to handle variable traffic patterns. However, previous papers have not taken into account the effect of saturation of intra-group (local) links. This paper studies how local link saturation can be common in these networks, and shows that it can largely reduce the performance. The solution to this problem is to use nonminimal paths that avoid those saturated local links. However, this extends the maximum path length, and since all previous routing proposals prevent deadlock by relying on an ascending order of virtual channels, it would imply unaffordable cost and complexity in the network routers. In this paper we introduce a novel routing/flow-control scheme that decouples the routing and the deadlock avoidance mechanisms. Our model does not impose any dependencies between virtual channels, allowing for on-the-fly (in-transit) adaptive routing of packets. To prevent deadlock we employ a deadlock-free escape sub network based on injection restriction. Simulations show that our model obtains lower latency, higher throughput, and faster adaptation to transient traffic, because it dynamically exploits a higher path diversity to avoid saturated links. Notably, our proposal consumes traffic bursts 43% faster than previous ones.