Comparing algorithm for dynamic speed-setting of a low-power CPU
MobiCom '95 Proceedings of the 1st annual international conference on Mobile computing and networking
Wattch: a framework for architectural-level power analysis and optimizations
Proceedings of the 27th annual international symposium on Computer architecture
Real-time dynamic voltage scaling for low-power embedded operating systems
SOSP '01 Proceedings of the eighteenth ACM symposium on Operating systems principles
Power and performance evaluation of globally asynchronous locally synchronous processors
ISCA '02 Proceedings of the 29th annual international symposium on Computer architecture
Power-efficient issue queue design
Power aware computing
Low-Latency Asynchronous FIFO's Using Token Rings
ASYNC '00 Proceedings of the 6th International Symposium on Advanced Research in Asynchronous Circuits and Systems
Interfacing Synchronous and Asynchronous Modules Within a High-Speed Pipeline
ARVLSI '97 Proceedings of the 17th Conference on Advanced Research in VLSI (ARVLSI '97)
A Low-Latency FIFO for Mixed-Clock Systems
WVLSI '00 Proceedings of the IEEE Computer Society Annual Workshop on VLSI (WVLSI'00)
Circuit Implementation of a 600MHz Superscalar RISC Microprocessor
ICCD '98 Proceedings of the International Conference on Computer Design
The FAB Predictor: Using Fourier Analysis to Predict the Outcome of Conditional Branches
HPCA '02 Proceedings of the 8th International Symposium on High-Performance Computer Architecture
HPCA '02 Proceedings of the 8th International Symposium on High-Performance Computer Architecture
Profile-based dynamic voltage and frequency scaling for a multiple clock domain microprocessor
Proceedings of the 30th annual international symposium on Computer architecture
A mixed-clock issue queue design for globally asynchronous, locally synchronous processor cores
Proceedings of the 2003 international symposium on Low power electronics and design
Fine-grained power management for multithreaded processor cores
Proceedings of the 2004 ACM symposium on Applied computing
TRIPS: A polymorphous architecture for exploiting ILP, TLP, and DLP
ACM Transactions on Architecture and Code Optimization (TACO)
Reducing pipeline energy demands with local DVS and dynamic retiming
Proceedings of the 2004 international symposium on Low power electronics and design
Application adaptive energy efficient clustered architectures
Proceedings of the 2004 international symposium on Low power electronics and design
Mixed-clock issue queue design for energy aware, high-performance cores
Proceedings of the 2004 Asia and South Pacific Design Automation Conference
Formal online methods for voltage/frequency control in multiple clock domain microprocessors
ASPLOS XI Proceedings of the 11th international conference on Architectural support for programming languages and operating systems
The Energy Impact of Aggressive Loop Fusion
Proceedings of the 13th International Conference on Parallel Architectures and Compilation Techniques
Robust interfaces for mixed-timing systems
IEEE Transactions on Very Large Scale Integration (VLSI) Systems
Dynamically Trading Frequency for Complexity in a GALS Microprocessor
Proceedings of the 37th annual IEEE/ACM International Symposium on Microarchitecture
Exploiting Barriers to Optimize Power Consumption of CMPs
IPDPS '05 Proceedings of the 19th IEEE International Parallel and Distributed Processing Symposium (IPDPS'05) - Papers - Volume 01
A flexible simulation framework for graphics architectures
Proceedings of the ACM SIGGRAPH/EUROGRAPHICS conference on Graphics hardware
Variability and energy awareness: a microarchitecture-level perspective
Proceedings of the 42nd annual Design Automation Conference
Coordinated, distributed, formal energy management of chip multiprocessors
ISLPED '05 Proceedings of the 2005 international symposium on Low power electronics and design
Toward a multiple clock/voltage island design style for power-aware processors
IEEE Transactions on Very Large Scale Integration (VLSI) Systems
A Dynamic Compilation Framework for Controlling Microprocessor Energy and Performance
Proceedings of the 38th annual IEEE/ACM International Symposium on Microarchitecture
Power reduction techniques for microprocessor systems
ACM Computing Surveys (CSUR)
A simulation-based study of wireless sensor network middleware
International Journal of Network Management
Energy aware multiple clock domain scheduling for a bit-serial, self-timed architecture
SBCCI '06 Proceedings of the 19th annual symposium on Integrated circuits and systems design
Independent front-end and back-end dynamic voltage scaling for a GALS microarchitecture
Proceedings of the 2006 international symposium on Low power electronics and design
An intra-task dvfs technique based on statistical analysis of hardware events
Proceedings of the 4th international conference on Computing frontiers
A self-tuning configurable cache
Proceedings of the 44th annual Design Automation Conference
Energy minimization with loop fusion and multi-functional-unit scheduling for multidimensional DSP
Journal of Parallel and Distributed Computing
Compiler-directed frequency and voltage scaling for a multiple clock domain microarchitecture
Proceedings of the 5th conference on Computing frontiers
ACM Transactions on Design Automation of Electronic Systems (TODAES)
Proceedings of the 13th international symposium on Low power electronics and design
A Non-blocking Multithreaded Architecture with Support for Speculative Threads
ICA3PP '08 Proceedings of the 8th international conference on Algorithms and Architectures for Parallel Processing
Integrated Circuit and System Design. Power and Timing Modeling, Optimization and Simulation
IEEE Transactions on Very Large Scale Integration (VLSI) Systems
Two-phase synchronization with sub-cycle latency
Integration, the VLSI Journal
Dynamic MIPS rate stabilization in out-of-order processors
Proceedings of the 36th annual international symposium on Computer architecture
Improving SMT performance: an application of genetic algorithms to configure resizable caches
Proceedings of the 11th Annual Conference Companion on Genetic and Evolutionary Computation Conference: Late Breaking Papers
Low Vccmin fault-tolerant cache with highly predictable performance
Proceedings of the 42nd Annual IEEE/ACM International Symposium on Microarchitecture
Dynamic capacity-speed tradeoffs in SMT processor caches
HiPEAC'07 Proceedings of the 2nd international conference on High performance embedded architectures and compilers
Integrated CPU cache power management in multiple clock domain processors
HiPEAC'08 Proceedings of the 3rd international conference on High performance embedded architectures and compilers
The implementation and evaluation of a low-power clock distribution network based on EPIC
NPC'07 Proceedings of the 2007 IFIP international conference on Network and parallel computing
Online energy-saving algorithm for sensor networks in dynamic changing environments
Journal of Embedded Computing
Power aware SID-based simulator for embedded multicore DSP subsystems
CODES/ISSS '10 Proceedings of the eighth IEEE/ACM/IFIP international conference on Hardware/software codesign and system synthesis
CPM in CMPs: Coordinated Power Management in Chip-Multiprocessors
Proceedings of the 2010 ACM/IEEE International Conference for High Performance Computing, Networking, Storage and Analysis
Analysis of integrated circuits thermal dynamics with point heating time
Microelectronics Journal
Fine-grained DVFS using on-chip regulators
ACM Transactions on Architecture and Code Optimization (TACO)
Energy-Aware Loop Parallelism Maximization for Multi-core DSP Architectures
GREENCOM-CPSCOM '10 Proceedings of the 2010 IEEE/ACM Int'l Conference on Green Computing and Communications & Int'l Conference on Cyber, Physical and Social Computing
High rate data synchronization in GALS socs
IEEE Transactions on Very Large Scale Integration (VLSI) Systems
Exploiting dynamic micro-architecture usage in gate sizing
Microprocessors & Microsystems
A phase adaptive cache hierarchy for SMT processors
Microprocessors & Microsystems
RISC/DSP dual core wireless soc processor focused on multimedia applications
EUC'05 Proceedings of the 2005 international conference on Embedded and Ubiquitous Computing
Characterizing the performance and energy attributes of scientific simulations
ICCS'06 Proceedings of the 6th international conference on Computational Science - Volume Part I
Dynamic instruction cascading on GALS microprocessors
PATMOS'05 Proceedings of the 15th international conference on Integrated Circuit and System Design: power and Timing Modeling, Optimization and Simulation
Dynamic processor throttling for power efficient computations
PACS'04 Proceedings of the 4th international conference on Power-Aware Computer Systems
Exploring multi-threaded Java application performance on multicore hardware
Proceedings of the ACM international conference on Object oriented programming systems languages and applications
HPCC'07 Proceedings of the Third international conference on High Performance Computing and Communications
KnightShift: Scaling the Energy Proportionality Wall through Server-Level Heterogeneity
MICRO-45 Proceedings of the 2012 45th Annual IEEE/ACM International Symposium on Microarchitecture
Architecturally homogeneous power-performance heterogeneous multicore systems
IEEE Transactions on Very Large Scale Integration (VLSI) Systems
In-network monitoring and control policy for DVFS of CMP networks-on-chip and last level caches
ACM Transactions on Design Automation of Electronic Systems (TODAES) - Special Section on Networks on Chip: Architecture, Tools, and Methodologies
Hi-index | 0.00 |
We describe the design, analysis, and performance of an on--line algorithm to dynamically control the frequency/voltage of a Multiple Clock Domain (MCD) microarchitecture. The MCD microarchitecture allows the frequency/voltage of microprocessor regions to be adjusted independently and dynamically, allowing energy savings when the frequency of some regions can be reduced without significantly impacting performance.Our algorithm achieves on average a 19.0% reduction in Energy Per Instruction (EPI), a 3.2% increase in Cycles Per Instruction (CPI), a 16.7% improvement in Energy--Delay Product, and a Power Savings to Performance Degradation ratio of 4.6. Traditional frequency/voltage scaling techniques which apply reductions globally to a fully synchronous processor achieve a Power Savings to Performance Degradation ratio of only 2--3. Our Energy--Delay Product improvement is 85.5% of what has been achieved using an off--line algorithm. These results were achieved using a broad range of applications from the MediaBench, Olden, and Spec2000 benchmark suites using an algorithm we show to require minimal hardware resources.