Link pipelining strategies for an application-specific asynchronous NoC

Authors:
Daniel Gebhardt;Junbok You;Kenneth S. Stevens
Affiliations:
University of Utah;University of Utah;University of Utah
Venue:
NOCS '11 Proceedings of the Fifth ACM/IEEE International Symposium on Networks-on-Chip
Year:
2011

Abstract

Wire latency across the links of a NoC can limit throughput, especially in deep submicron technology. Stateful pipeline buffers added to long links allow a higher clock rate, but they waste resources on links that need only low bandwidth. In asynchronous (clockless) NoCs, pipelining can be applied only to those links that will benefit from both increased throughput and buffering capacity, which is especially useful in heterogeneous embedded SoCs. We evaluate two strategies that determine where link pipeline buffers should be placed in the topology. The first compares, for each link, the available link bandwidth (based on physical wirelength) to the throughput needed by each source-to-destination path. The second adds buffers to a link so that its bandwidth is at least equal to the throughput of a core's network adapter. These strategies were integrated into our network optimization tool for an application-specific SoC. Simulations were based on the SoC's expected traffic patterns and floorplan-derived wirelength, and used self-similar traffic generation for more realistic behavior. Results show improved large-message network latency and improved output buffer delay at the network adapter. Adding pipeline buffers slightly increased power, but our proposal is a complexity-effective improvement by the power*latency product metric. The results indicate that pipelining selected links is more efficient than adding buffers ubiquitously.
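
The two placement strategies described in the abstract can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the linear delay model (a fixed per-stage handshake overhead plus a delay proportional to segment length), the constants, the function names, and the choice to size a link for the most demanding path that crosses it (summing path demands would be another reading of the first strategy).

```python
# Illustrative sketch only: the delay model and constants are assumptions,
# not values or methods from the paper.
T_FIXED_NS = 0.5        # hypothetical handshake overhead per pipeline stage (ns)
T_WIRE_NS_PER_MM = 0.4  # hypothetical wire delay per mm of segment length (ns/mm)


def link_bandwidth(length_mm, n_buffers):
    """Flits per ns for a link split into n_buffers + 1 equal segments."""
    segment_mm = length_mm / (n_buffers + 1)
    return 1.0 / (T_FIXED_NS + T_WIRE_NS_PER_MM * segment_mm)


def buffers_needed(length_mm, target_bw, max_buffers=16):
    """Smallest buffer count whose bandwidth meets target_bw, else None."""
    for n in range(max_buffers + 1):
        if link_bandwidth(length_mm, n) >= target_bw:
            return n
    return None


def strategy_path_demand(links, paths):
    """First strategy (sketch): size each link for the paths that cross it.

    links: {link_name: wirelength_mm}
    paths: list of (route, required_bw), where route is a tuple of link names.
    Assumption: a link must satisfy the most demanding path routed over it.
    """
    plan = {}
    for name, length_mm in links.items():
        demand = max((bw for route, bw in paths if name in route), default=0.0)
        plan[name] = buffers_needed(length_mm, demand)
    return plan


def strategy_adapter_rate(links, adapter_bw):
    """Second strategy (sketch): every link must match the adapter throughput."""
    return {name: buffers_needed(length_mm, adapter_bw)
            for name, length_mm in links.items()}


if __name__ == "__main__":
    links = {"a": 1.0, "b": 4.0}                      # hypothetical wirelengths (mm)
    paths = [(("a", "b"), 0.7), (("b",), 0.3)]        # hypothetical path demands
    print(strategy_path_demand(links, paths))
    print(strategy_adapter_rate(links, adapter_bw=1.0))
```

Under this toy model the short link needs no buffers in either case, while the long link receives fewer buffers when sized by path demand than when sized to the adapter rate, which is the kind of trade-off the two strategies are meant to expose.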