We present the Homogeneous-Parallel-Concentrated-Mesh topology (HPC-Mesh), a NoC topology that provides four disjoint homogeneous concentrated mesh networks. The network interface at each core connects to all four networks and selects among them with a novel injection algorithm. This algorithm adjusts the topology dynamically to the traffic demand: at low traffic rates only part of the network is used, minimizing power consumption, while at high traffic rates all the networks are used, maximizing performance. We compare the HPC-Mesh against other topologies (always using power and clock gating) with both synthetic traffic and real applications within a complete simulated system. Compared to the 2D-Mesh, with real applications we reduce execution time by 14% and energy consumption by 22% on average when using 16 cores, and by up to 24% in execution time and 11% in energy consumption when using 32 cores. The new topology also provides a higher degree of fault tolerance: it keeps working when up to three of its sub-networks fail. We also extend the topology to 3D stacked chips, where it exhibits a low and practical resource overhead.
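The abstract does not give the details of the injection algorithm, so the following is only a minimal sketch of the general idea (use few sub-networks at low load, all of them at high load, and skip failed ones). The class name `Injector`, the two thresholds, the load metric and the round-robin selection are all assumptions made for illustration and are not taken from the paper.

```python
from dataclasses import dataclass, field

NUM_SUBNETS = 4  # the abstract states four disjoint concentrated meshes


@dataclass
class Injector:
    """Hypothetical per-core network interface; NOT the authors' algorithm."""
    # Assumed per-subnet load (flits/cycle) above which another subnet is woken.
    wake_threshold: float = 0.5
    # Assumed per-subnet load below which one active subnet is gated off.
    sleep_threshold: float = 0.2
    active: int = 1                            # powered-on subnets (>= 1)
    failed: set = field(default_factory=set)   # indices of faulty subnets
    _next: int = 0                             # round-robin cursor

    def usable(self):
        """Healthy subnets that are currently powered on."""
        healthy = [i for i in range(NUM_SUBNETS) if i not in self.failed]
        return healthy[:self.active]

    def update(self, offered_load):
        """Grow or shrink the active set by one step for the offered load.

        Assumes at least one subnet is healthy (the abstract tolerates
        up to three failed sub-networks).
        """
        healthy = NUM_SUBNETS - len(self.failed)
        if offered_load / self.active > self.wake_threshold and self.active < healthy:
            self.active += 1   # high traffic: wake one more subnet
        elif self.active > 1 and offered_load / (self.active - 1) < self.sleep_threshold:
            self.active -= 1   # low traffic: gate one subnet off
        self.active = max(1, min(self.active, healthy))

    def pick_subnet(self):
        """Round-robin over the active, healthy subnets."""
        nets = self.usable()
        choice = nets[self._next % len(nets)]
        self._next += 1
        return choice
```

In this sketch, calling `update` once per measurement interval moves the active set by at most one sub-network at a time, and the gap between the two assumed thresholds acts as hysteresis so that sub-networks are not switched on and off at every small load change.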