Many high-radix Network-on-Chip (NOC) topologies have been proposed to improve network performance as the number of processing elements (PEs) on a chip keeps growing. We believe the Clos Network-on-Chip (CNOC) is the most promising of these because of its low average hop count and good load-balancing characteristics. In this paper, we propose (1) a high-radix router architecture with a Virtual Output Queue (VOQ) buffer structure and a Packet Mode Dual Round-Robin Matching (PDRRM) scheduling algorithm to achieve high speed and high throughput in CNOC, and (2) a heuristic floor-planning algorithm to minimize the power consumed by long wires. Experimental results show that the throughput of a 64-node, 3-stage CNOC under uniform traffic increases from 62% to 78% when the baseline routers are replaced with PDRRM VOQ routers. We also compare CNOC with other NOC topologies and find that, with the new design techniques, CNOC has the highest throughput, the lowest zero-load latency, and the best power efficiency.
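To make the VOQ and round-robin matching idea concrete, the following is a minimal sketch of one request/grant iteration of plain dual round-robin matching over virtual output queues. It is an illustration under stated assumptions, not the router design proposed in the paper: the function name `drrm_match`, the occupancy-count representation of the VOQs, and the single-iteration, one-cell-per-slot model are all hypothetical, and the packet-mode behavior of PDRRM (which would presumably hold a match until a whole packet has been transferred) is not modeled.

```python
def drrm_match(voq, in_ptr, out_ptr):
    """One request/grant iteration of dual round-robin matching (sketch).

    voq[i][j]  -- number of cells queued at input i for output j (assumed model)
    in_ptr[i]  -- round-robin pointer of input i over outputs
    out_ptr[j] -- round-robin pointer of output j over inputs
    All three arguments are updated in place.
    Returns a dict mapping each matched input to its output.
    """
    n = len(voq)

    # Request phase: each input sends one request, for the first
    # non-empty VOQ found at or after its round-robin pointer.
    requests = {}  # output -> list of requesting inputs
    for i in range(n):
        for k in range(n):
            j = (in_ptr[i] + k) % n
            if voq[i][j] > 0:
                requests.setdefault(j, []).append(i)
                break

    # Grant phase: each output grants the requesting input closest to
    # (at or after) its own round-robin pointer.
    match = {}
    for j, inputs in requests.items():
        winner = min(inputs, key=lambda i: (i - out_ptr[j]) % n)
        match[winner] = j
        # Both pointers advance to one position past the matched port.
        out_ptr[j] = (winner + 1) % n
        in_ptr[winner] = (j + 1) % n
        voq[winner][j] -= 1  # one cell leaves the granted VOQ
    return match


# Example: a 2x2 switch where both inputs initially contend for output 0.
voq = [[1, 1], [1, 0]]
in_ptr, out_ptr = [0, 0], [0, 0]
first = drrm_match(voq, in_ptr, out_ptr)
second = drrm_match(voq, in_ptr, out_ptr)
```

Because each input issues only one request per slot, an output never has to choose among requests that the input would later have to arbitrate again, which is what keeps this style of scheduler simple enough for a fast high-radix router.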