TornadoNoC: A lightweight and scalable on-chip network architecture for the many-core era

Authors:
Junghee Lee;Chrysostomos Nicopoulos;Hyung Gyu Lee;Jongman Kim
Affiliations:
Georgia Institute of Technology, USA;University of Cyprus;Daegu University, South Korea;Georgia Institute of Technology
Venue:
ACM Transactions on Architecture and Code Optimization (TACO)
Year:
2013

Abstract

The rapid emergence of Chip Multi-Processors (CMPs) as the de facto microprocessor archetype has highlighted the importance of scalable and efficient on-chip networks. Packet-based Networks-on-Chip (NoCs) are gradually cementing themselves as the medium of choice for the multi-/many-core systems of the near future, due to their innate scalability. However, the prominence of the debilitating power wall requires the NoC to also be as energy efficient as possible. To achieve these two antipodal requirements, scalability and energy efficiency, we propose TornadoNoC, an interconnect architecture that employs a novel flow control mechanism. To prevent livelocks and deadlocks, a sequence numbering scheme and a dynamic ring inflation technique are proposed, and their correctness is formally proven. The primary objective of TornadoNoC is to achieve substantial gains in (a) scalability to many-core systems and (b) the area/power footprint, as compared to current state-of-the-art router implementations. The new router is demonstrated to provide better scalability to hundreds of cores than an ideal single-cycle wormhole implementation and other scalability-enhanced low-cost routers. Extensive simulations using both synthetic traffic patterns and real applications running in a full-system simulator corroborate the efficacy of the proposed design. Finally, hardware synthesis analysis using commercial 65 nm standard-cell libraries indicates that the area and power budgets of the new router are reduced by up to 53% and 58%, respectively, as compared to existing state-of-the-art low-cost routers.