Design Challenges of Technology Scaling
IEEE Micro
Orion: a power-performance simulator for interconnection networks
Proceedings of the 35th annual ACM/IEEE international symposium on Microarchitecture
Compile-time dynamic voltage scaling settings: opportunities and limits
PLDI '03 Proceedings of the ACM SIGPLAN 2003 conference on Programming language design and implementation
PowerHerd: dynamic satisfaction of peak power constraints in interconnection networks
ICS '03 Proceedings of the 17th annual international conference on Supercomputing
Dynamic Voltage Scaling with Links for Power Optimization of Interconnection Networks
HPCA '03 Proceedings of the 9th International Symposium on High-Performance Computer Architecture
Dynamic Thermal Management for High-Performance Microprocessors
HPCA '01 Proceedings of the 7th International Symposium on High-Performance Computer Architecture
A Delay Model and Speculative Architecture for Pipelined Routers
HPCA '01 Proceedings of the 7th International Symposium on High-Performance Computer Architecture
HPCA '02 Proceedings of the 8th International Symposium on High-Performance Computer Architecture
Low-Latency Virtual-Channel Routers for On-Chip Networks
Proceedings of the 31st annual international symposium on Computer architecture
Thermal Modeling, Characterization and Management of On-Chip Networks
Proceedings of the 37th annual IEEE/ACM International Symposium on Microarchitecture
Statistical Analysis of Clock Skew Variation in H-Tree Structure
ISQED '05 Proceedings of the 6th International Symposium on Quality of Electronic Design
A Holistic Approach to Designing Energy-Efficient Cluster Interconnects
IEEE Transactions on Computers
A low latency router supporting adaptivity for on-chip interconnects
Proceedings of the 42nd annual Design Automation Conference
Design and analysis of an NoC architecture from performance, reliability and energy perspective
Proceedings of the 2005 ACM symposium on Architecture for networking and communications systems
Mitigating the Impact of Process Variations on Processor Register Files and Execution Units
Proceedings of the 39th Annual IEEE/ACM International Symposium on Microarchitecture
ReCycle:: pipeline adaptation to tolerate process variation
Proceedings of the 34th annual international symposium on Computer architecture
Interconnect design considerations for large NUCA caches
Proceedings of the 34th annual international symposium on Computer architecture
Technology-Driven, Highly-Scalable Dragonfly Topology
ISCA '08 Proceedings of the 35th Annual International Symposium on Computer Architecture
ISCA '08 Proceedings of the 35th Annual International Symposium on Computer Architecture
ReVIVaL: A Variation-Tolerant Architecture Using Voltage Interpolation and Variable Latency
ISCA '08 Proceedings of the 35th Annual International Symposium on Computer Architecture
The PARSEC benchmark suite: characterization and architectural implications
Proceedings of the 17th international conference on Parallel architectures and compilation techniques
A case for bufferless routing in on-chip networks
Proceedings of the 36th annual international symposium on Computer architecture
A variable-pipeline on-chip router optimized to traffic pattern
Proceedings of the Third International Workshop on Network on Chip Architectures
Shared Register File Based ILP for Multicore
GREENCOM-CPSCOM '10 Proceedings of the 2010 IEEE/ACM Int'l Conference on Green Computing and Communications & Int'l Conference on Cyber, Physical and Social Computing
A case for heterogeneous on-chip interconnects for CMPs
Proceedings of the 38th annual international symposium on Computer architecture
BarrierWatch: characterizing multithreaded workloads across and within program-defined epochs
Proceedings of the 8th ACM International Conference on Computing Frontiers
NoC frequency scaling with flexible-pipeline routers
Proceedings of the 17th IEEE/ACM international symposium on Low-power electronics and design
Energy-efficient non-minimal path on-chip interconnection network for heterogeneous systems
Proceedings of the 2012 ACM/IEEE international symposium on Low power electronics and design
Power-aware performance increase via core/uncore reinforcement control for chip-multiprocessors
Proceedings of the 2012 ACM/IEEE international symposium on Low power electronics and design
APCR: an adaptive physical channel regulator for on-chip interconnects
Proceedings of the 21st international conference on Parallel architectures and compilation techniques
Designing energy-efficient NoC for real-time embedded systems through slack optimization
Proceedings of the 50th Annual Design Automation Conference
Dynamic voltage and frequency scaling for shared resources in multicore processor designs
Proceedings of the 50th Annual Design Automation Conference
Application-driven end-to-end traffic predictions for low power NoC design
IEEE Transactions on Very Large Scale Integration (VLSI) Systems
In-network monitoring and control policy for DVFS of CMP networks-on-chip and last level caches
ACM Transactions on Design Automation of Electronic Systems (TODAES) - Special Section on Networks on Chip: Architecture, Tools, and Methodologies
Proceedings of the Ninth IEEE/ACM/IFIP International Conference on Hardware/Software Codesign and System Synthesis
Hi-index | 0.00 |
Performance and power are the first order design metrics for Network-on-Chips (NoCs) that have become the de-facto standard in providing scalable communication backbones for multicores/CMPs. However, NoCs can be plagued by higher power consumption and degraded throughput if the network and router are not designed properly. Towards this end, this paper proposes a novel router architecture, where we tune the frequency of a router in response to network load to manage both performance and power. We propose three dynamic frequency tuning techniques, FreqBoost, FreqThrtl and FreqTune, targeted at congestion and power management in NoCs. As enablers for these techniques, we exploit Dynamic Voltage and Frequency Scaling (DVFS) and the imbalance in a generic router pipeline through time stealing. Experiments using synthetic workloads on a 8x8 wormhole-switched mesh interconnect show that FreqBoost is a better choice for reducing average latency (maximum 40%) while, FreqThrtl provides the maximum benefits in terms of power saving and energy delay product (EDP). The FreqTune scheme is a better candidate for optimizing both performance and power, achieving on an average 36% reduction in latency, 13% savings in power (up to 24% at high load), and 40% savings (up to 70% at high load) in EDP. With application benchmarks, we observe IPC improvement up to 23% using our design. The performance and power benefits also scale for larger NoCs.