With a growing number of processors forming many-core chip multiprocessors (CMPs), emerging systems need an easily scalable, high-performance, low-power intra-chip communication infrastructure. In CMPs with hundreds of processing elements, 3D integration can be used to shorten the long wires that form communication links. In this paper, we propose a Clos network-on-chip (CNOC) in conjunction with 3D integration as a viable network topology for many-core CMPs. The primary benefits of 3D CNOC are scalability and a clear upper bound on power dissipation. We present the architectural and physical design of 3D CNOC and compare its performance with that of fat tree, flattened butterfly, and mesh topologies. The comparison shows that the power consumption of 3D CNOC increases only minimally relative to the other topologies as the network size is scaled from 64 to 512 nodes. Furthermore, in a 512-node system, 3D CNOC consumes about 15% less average power than any other topology. We also compare 3D partitioning strategies for these topologies and discuss their effect on wire delay and the number of through-silicon vias.
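One way to build intuition for the scalability claim is to count how many routers a packet crosses in each topology. The sketch below is only an illustration and is not taken from the paper: it assumes uniform random traffic, minimal (Manhattan) routing on a 2D mesh, and a generic three-stage Clos network in which every path crosses one ingress, one middle, and one egress router. The mesh dimensions (8x8 for 64 nodes, 16x32 for 512 nodes) and all function names are hypothetical choices made for this example, and router count is only a rough proxy for power.

```python
from itertools import product


def mean_line_distance(k):
    """Mean |i - j| over all ordered pairs (i, j) of positions on a k-node line."""
    total = sum(abs(i - j) for i, j in product(range(k), repeat=2))
    return total / (k * k)


def mesh_mean_routers(kx, ky):
    """Mean routers crossed on a kx-by-ky mesh with minimal routing.

    Assumption: source and destination are chosen uniformly at random,
    and a path of d links crosses d + 1 routers.
    """
    return mean_line_distance(kx) + mean_line_distance(ky) + 1


def clos_routers():
    """Routers crossed in a three-stage Clos: ingress, middle, egress.

    This count does not depend on the number of nodes.
    """
    return 3


# Hypothetical mesh shapes for the two network sizes named in the abstract.
for nodes, (kx, ky) in {64: (8, 8), 512: (16, 32)}.items():
    print(nodes, "nodes:",
          "mesh", round(mesh_mean_routers(kx, ky), 2),
          "| Clos", clos_routers())
```

Under these assumptions the mean mesh path grows with the mesh dimensions while the Clos path stays fixed at three routers, which is consistent with the bounded growth the abstract describes. The sketch does not model link length, router radix, or the paper's actual power model.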