Exploiting short-lived variables in superscalar processors
Proceedings of the 28th annual international symposium on Microarchitecture
Exploiting dead value information
MICRO 30 Proceedings of the 30th annual ACM/IEEE international symposium on Microarchitecture
Software-Directed Register Deallocation for Simultaneous Multithreaded Processors
IEEE Transactions on Parallel and Distributed Systems
Multiple-banked register file architectures
Proceedings of the 27th annual international symposium on Computer architecture
Design issues for dynamic voltage scaling
ISLPED '00 Proceedings of the 2000 international symposium on Low power electronics and design
Cache decay: exploiting generational behavior to reduce cache leakage power
ISCA '01 Proceedings of the 28th annual international symposium on Computer architecture
Drowsy caches: simple techniques for reducing leakage power
ISCA '02 Proceedings of the 29th annual international symposium on Computer architecture
Reducing the complexity of the register file in dynamic superscalar processors
Proceedings of the 34th annual ACM/IEEE international symposium on Microarchitecture
Proceedings of the 35th annual ACM/IEEE international symposium on Microarchitecture
HPCA '98 Proceedings of the 4th International Symposium on High-Performance Computer Architecture
Energy-Efficient Register Access
SBCCI '00 Proceedings of the 13th symposium on Integrated circuits and systems design
Profile-Based Dynamic Voltage Scheduling Using Program Checkpoints
Proceedings of the conference on Design, automation and test in Europe
On Reducing Register Pressure and Energy in Multiple-Banked Register Files
ICCD '03 Proceedings of the 21st International Conference on Computer Design
Reducing instruction cache energy consumption using a compiler-based strategy
ACM Transactions on Architecture and Code Optimization (TACO)
Isolating Short-Lived Operands for Energy Reduction
IEEE Transactions on Computers
Power-aware compilation for register file energy reduction
International Journal of Parallel Programming - Special issue: Workshop on application specific processors (WASP)
Late Allocation and Early Release of Physical Registers
IEEE Transactions on Computers
Intraprogram dynamic voltage scaling: Bounding opportunities with analytic modeling
ACM Transactions on Architecture and Code Optimization (TACO)
Increasing Processor Performance Through Early Register Release
ICCD '04 Proceedings of the IEEE International Conference on Computer Design
Using Performance Counters for Runtime Temperature Sensing in High-Performance Processors
IPDPS '05 Proceedings of the 19th IEEE International Parallel and Distributed Processing Symposium (IPDPS'05) - Workshop 11 - Volume 12
Bypass aware instruction scheduling for register file power reduction
Proceedings of the 2006 ACM SIGPLAN/SIGBED conference on Language, compilers, and tool support for embedded systems
Exploring the limits of early register release: Exploiting compiler analysis
ACM Transactions on Architecture and Code Optimization (TACO)
Energy-efficient register caching with compiler assistance
ACM Transactions on Architecture and Code Optimization (TACO)
Saving register-file leakage power by monitoring instruction sequence in ROB
EUC'06 Proceedings of the 2006 international conference on Emerging Directions in Embedded and Ubiquitous Computing
Temperature and supply Voltage aware performance and power modeling at microarchitecture level
IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems
Intra-task voltage scheduling on DVS-enabled hard real-time systems
IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems
AFReP: application-guided function-level registerfile power-gating for embedded processors
Proceedings of the International Conference on Computer-Aided Design
Hi-index | 0.00 |
Modern information technology (IT) applications make microprocessors require not only high performance, but also low power-consumption. To enhance computational performance, many instruction level parallelism techniques have been implemented in current microprocessors, e.g., data forwarding, out-of-order execution, register renaming etc. The reorder buffer (ROB) and the register file are the two most critical components to implement these features. The cooperation of them, however, causes serious static-power consumption on a physical register file which stores a large amount of speculative and committed temporary values. In this paper, we use the Pentium 4-like processor as the baseline architecture and propose a runtime approach to save the physical register file's static power. In this approach, a monitoring mechanism is built in the ROB and the register file to identify the timing of usage for each register. This mechanism can be integrated with a DVS approach on the datapath to power down (or up) the supply voltages to a register when it is idle (or active). Simulation results show that by this monitoring mechanism and a low-cost DVS design, a 128-entry register file can save at least 50% register file power consumption.