Data Mining Meets Performance Evaluation: Fast Algorithms for Modeling Bursty Traffic
ICDE '02 Proceedings of the 18th International Conference on Data Engineering
An Asynchronous NOC Architecture Providing Low Latency Service and Its Multi-Level Design Framework
ASYNC '05 Proceedings of the 11th IEEE International Symposium on Asynchronous Circuits and Systems
Proceedings of the conference on Design, Automation and Test in Europe - Volume 2
Performance driven reliable link design for networks on chips
Proceedings of the 2005 Asia and South Pacific Design Automation Conference
Designing application-specific networks on chips with floorplan information
Proceedings of the 2006 IEEE/ACM international conference on Computer-aided design
Invited paper: Network-on-Chip design and synthesis outlook
Integration, the VLSI Journal
ISCA '08 Proceedings of the 35th Annual International Symposium on Computer Architecture
Physical Implementation of the DSPIN Network-on-Chip in the FAUST Architecture
NOCS '08 Proceedings of the Second ACM/IEEE International Symposium on Networks-on-Chip
Integration, the VLSI Journal
A methodology for constraint-driven synthesis of on-chip communications
IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems
Low-cost router microarchitecture for on-chip networks
Proceedings of the 42nd Annual IEEE/ACM International Symposium on Microarchitecture
A Low-Overhead Asynchronous Interconnection Network for GALS Chip Multiprocessors
NOCS '10 Proceedings of the 2010 Fourth ACM/IEEE International Symposium on Networks-on-Chip
Comparing Energy and Latency of Asynchronous and Synchronous NoCs for Embedded SoCs
NOCS '10 Proceedings of the 2010 Fourth ACM/IEEE International Symposium on Networks-on-Chip
ORION 2.0: a fast and accurate NoC power and area model for early-stage design space exploration
Proceedings of the Conference on Design, Automation and Test in Europe
Hi-index | 0.00 |
Wire latency across the links of a NoC can limit throughput, especially in deep submicron technology. Stateful pipeline buffers added to long links allow a higher clock rate, but this wastes resources on links needing only low bandwidth. In asynchronous (clockless) NoCs, link pipelining can be done to only those that will benefit from both increased throughput and buffering capacity, and is especially useful in heterogeneous embedded SoCs. We evaluate two strategies that determine where link pipeline buffers should be placed in the topology. The first compares available link bandwidth, based on physical wirelength, to the throughput needed by each source-to-destination path, for each link. The second adds buffers to a link such that its bandwidth is at least equal to the throughput of a core's network adapter. These strategies were integrated into our network optimization tool for an application-specific SoC. Simulations were based on its expected traffic patterns, floorplan-derived wirelength, and uses self-similar traffic generation for more realistic behavior. Results show improved large-message network latency and output buffer delay of the network adapter. There was a slight power increase with the addition of pipeline buffers, but our proposal is a complexity-effective improvement by the power*latency product metric. The results indicate the strategy of pipelining certain links provides more efficiency opposed to a ubiquitous addition of buffers.