The NoX router

  • Authors: Mitchell Hayenga; Mikko Lipasti

  • Affiliations: University of Wisconsin-Madison; University of Wisconsin-Madison

  • Venue: Proceedings of the 44th Annual IEEE/ACM International Symposium on Microarchitecture
  • Year: 2011

Abstract

Power-efficient, low-latency interconnects are increasingly important in a computing era dominated by growing core counts and diminishing power budgets. This paper proposes a novel coding-based crossbar architecture that performs packet arbitration in parallel with switch traversal. A lightweight exclusive-OR (XOR) coding scheme enables the productive transmission of packets, without waiting for arbitration, even under contention. For marginal cost compared to fully speculative techniques, switch arbitration latency can be hidden while eliminating power-consuming misspeculations, increasing router throughput, and maintaining fairness. The new NoX router is compared to traditional sequential and speculative single-cycle router implementations on a 64-node CMP mesh. Physical implementation of all routers is modeled using synthesized RTL, detailed floorplans, and accurate channel models. Performance is evaluated using cycle-accurate simulation and detailed power models on both synthetic and application traffic. Overall, we find that the NoX architecture improves average packet energy-delay product by 2.7%-34.4% on application workloads and improves network throughput by up to 9.9% on synthetic traffic patterns.
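
The abstract does not spell out how the XOR coding works, so the following is only a minimal sketch of the general property that such a scheme could rely on: if two contending flits are combined with XOR, a receiver that later obtains one of them uncoded can recover the other. The function names `xor_encode` and `xor_decode` and the 32-bit flit values are hypothetical and chosen for illustration; they are not taken from the paper.

```python
def xor_encode(flit_a: int, flit_b: int) -> int:
    """Combine two contending flits into one coded word (assumed scheme)."""
    return flit_a ^ flit_b


def xor_decode(encoded: int, known_flit: int) -> int:
    """Recover the other flit from the coded word and one known flit."""
    return encoded ^ known_flit


# Hypothetical 32-bit flit payloads that collide at the same output port.
flit_a = 0xDEADBEEF
flit_b = 0x12345678

# The coded word can be sent without waiting for arbitration to finish.
encoded = xor_encode(flit_a, flit_b)

# Once one flit arrives uncoded, XOR with the coded word yields the other.
recovered_a = xor_decode(encoded, flit_b)
recovered_b = xor_decode(encoded, flit_a)
```

This works because XOR is its own inverse: `(a ^ b) ^ b == a` for any two integers. How the actual router orders the coded and uncoded transmissions is not stated in the abstract and is left open here.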