Fat-trees: universal networks for hardware-efficient supercomputing
IEEE Transactions on Computers
Deadlock-Free Message Routing in Multiprocessor Interconnection Networks
IEEE Transactions on Computers
The Stanford Dash Multiprocessor
Computer
Route packets, not wires: on-chip inteconnection networks
Proceedings of the 38th annual Design Automation Conference
Interconnection Networks: An Engineering Approach
Interconnection Networks: An Engineering Approach
Scalar Operand Networks: On-Chip Interconnect for ILP in Partitioned Architectures
HPCA '03 Proceedings of the 9th International Symposium on High-Performance Computer Architecture
System-Level Point-to-Point Communication Synthesis Using Floorplanning Information
ASP-DAC '02 Proceedings of the 2002 Asia and South Pacific Design Automation Conference
Power-driven Design of Router Microarchitectures in On-chip Networks
Proceedings of the 36th annual IEEE/ACM International Symposium on Microarchitecture
Microarchitectural Wire Management for Performance and Power in Partitioned Architectures
HPCA '05 Proceedings of the 11th International Symposium on High-Performance Computer Architecture
LiPaR: A light-weight parallel router for FPGA-based networks-on-chip
GLSVLSI '05 Proceedings of the 15th ACM Great Lakes symposium on VLSI
Implementation analysis of NoC: a MPSoC trace-driven approach
GLSVLSI '06 Proceedings of the 16th ACM Great Lakes symposium on VLSI
Design tradeoffs for tiled CMP on-chip networks
Proceedings of the 20th annual international conference on Supercomputing
NoC Topologies Exploration based on Mapping and Simulation Models
DSD '07 Proceedings of the 10th Euromicro Conference on Digital System Design Architectures, Methods and Tools
Flattened Butterfly Topology for On-Chip Networks
IEEE Computer Architecture Letters
ISCA '08 Proceedings of the 35th Annual International Symposium on Computer Architecture
Clock distribution scheme using coplanar transmission lines
Proceedings of the conference on Design, automation and test in Europe
A Low-Overhead Asynchronous Interconnection Network for GALS Chip Multiprocessors
NOCS '10 Proceedings of the 2010 Fourth ACM/IEEE International Symposium on Networks-on-Chip
A reconfigurable source-synchronous on-chip network for GALS many-core platforms
IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems - Special issue on the 2009 ACM/IEEE international symposium on networks-on-chip
A fully-asynchronous low-power framework for GALS NoC integration
Proceedings of the Conference on Design, Automation and Test in Europe
Mesochronous NoC technology for power-efficient GALS MPSoCs
Proceedings of the Fifth International Workshop on Interconnection Network Architecture: On-Chip, Multi-Chip
Interconnected Tile Standing Wave Resonant Oscillator Based Clock Distribution Circuits
VLSID '11 Proceedings of the 2011 24th International Conference on VLSI Design
ACM SIGARCH Computer Architecture News
Architectural simulations of a fast, source-synchronous ring-based Network-on-Chip design
ICCD '12 Proceedings of the 2012 IEEE 30th International Conference on Computer Design (ICCD 2012)
A fast, source-synchronous ring-based network-on-chip design
DATE '12 Proceedings of the Conference on Design, Automation and Test in Europe
Hi-index | 0.00 |
The mesh interconnection network has been preferred by the Network-on-Chip (NoC) community due to its simple implementation, high bandwidth and overall scalability. Most existing mesh-based NoC designs operate the mesh at the same or lower clock speed as the processing elements (PEs). Recently, a new source synchronous ring-based NoC architecture has been proposed, which runs significantly faster than the PEs and offers a significantly higher bandwidth and lower communication latency. The authors implement the NoC topology as a mesh of rings, which occupies the same area as that of a mesh. In this work, we evaluate two alternate source synchronous ring-based NoC topologies called the ring of stars (ROS) and the spine with rings (SWR), which occupy a much lower area, and are able to provide better performance in terms of communication latency compared to a state of the art mesh. In our proposed topologies, the clock and the data NoC are routed in parallel, yielding a fast, synchronous, robust design. Our design allows the PEs to extract a low jitter clock from the high speed ring clock by division. The area and performance of these ring-based NoC topologies is quantified. Experimental results on synthetic traffic show that the new ring-based NoC designs can provide significantly lower latency (upto 4.6×) compared to a state of the art mesh. The proposed floorplan-friendly topologies use fewer buffers (upto 50% less) and lower wire length (upto 64.3% lower) compared to the mesh. Depending on the performance and the area desired, a NoC designer can select among the topologies presented.