Isolating Short-Lived Operands for Energy Reduction

Authors:
Dmitry Ponomarev;Gurhan Kucuk;Oguz Ergin;Kanad Ghose
Venue:
IEEE Transactions on Computers
Year:
2004



Abstract

This paper proposes a mechanism for reducing the power requirements of processors that use a separate architectural register file (ARF) to hold committed values. We exploit the notion of short-lived operands: values that target architectural registers that are renamed by the time the instruction producing the value reaches the writeback stage. Our simulations of the SPEC 2000 benchmarks show that between 71 percent and 97 percent of the results are short-lived. By caching (and isolating) short-lived operands within a small dedicated register file, our technique avoids both unnecessary writebacks into the result repository (a reorder buffer (ROB) slot or a physical register) and the ARF writes caused by unnecessary commitments. Operands are cached in this manner until they can be safely discarded without jeopardizing recovery from possible branch mispredictions or reconstruction of the precise state in case of interrupts or exceptions. Additional energy savings are achieved by limiting the number of ports used for instruction commitment. The power/energy savings are validated using SPICE measurements of actual layouts in a 0.18 micron CMOS process. The energy reduction in the ROB and the ARF is about 20 percent (translating into an overall chip energy reduction of about 5 percent), and it is achieved with no increase in cycle time, little additional complexity, and no degradation in the number of instructions committed per cycle.
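
The definition of a short-lived operand above can be illustrated with a minimal sketch. It is not the authors' simulator or hardware mechanism. The `Instr` record, its cycle fields, and the `classify_short_lived` helper are hypothetical names introduced here for illustration, and counting a rename that happens in the same cycle as the producer's writeback is an assumption about the boundary case.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Instr:
    """Hypothetical trace record; field names are illustrative assumptions."""
    dest: Optional[int]    # architectural destination register, None if no result
    rename_cycle: int      # cycle in which the instruction passes the rename stage
    writeback_cycle: int   # cycle in which its result reaches writeback


def classify_short_lived(trace: List[Instr]) -> List[bool]:
    """Flag each result whose destination register is renamed again in time.

    `trace` is in program order. A result is flagged when the next
    instruction writing the same architectural register has been renamed
    no later than the producer's writeback cycle (assumed boundary rule).
    """
    flags = [False] * len(trace)
    for i, producer in enumerate(trace):
        if producer.dest is None:
            continue
        # Only the next writer of the same register matters: it is the
        # first instruction that renames the producer's destination.
        for later in trace[i + 1:]:
            if later.dest == producer.dest:
                flags[i] = later.rename_cycle <= producer.writeback_cycle
                break
    return flags


# Small made-up trace: r1 is renamed again before its first producer
# writes back, while r2 is renamed long after its producer writes back.
example_trace = [
    Instr(dest=1, rename_cycle=0, writeback_cycle=5),
    Instr(dest=2, rename_cycle=1, writeback_cycle=3),
    Instr(dest=1, rename_cycle=2, writeback_cycle=6),
    Instr(dest=2, rename_cycle=10, writeback_cycle=12),
]
short_lived = classify_short_lived(example_trace)
```

In this made-up trace only the first instruction is flagged. The sketch covers classification only; it does not model the dedicated register file, consumers that still need the value, or the conditions under which a cached operand can be safely discarded.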