Dynamically reducing pressure on the physical register file through simple register sharing

Authors:
L. Tran; N. Nelson; Fung Ngai; S. Dropsho; M. Huang
Affiliations:
Dept. of Electr. & Comput. Eng., Rochester Univ., NY, USA (all authors)
Venue:
ISPASS '04 Proceedings of the 2004 IEEE International Symposium on Performance Analysis of Systems and Software
Year:
2004



Abstract

Using register renaming and physical registers, modern microprocessors eliminate false data dependences that arise from reuse of the registers defined by the instruction set (logical registers). High-performance processors that have longer pipelines and a greater capacity to exploit instruction-level parallelism have more instructions in flight and require more physical registers. Simultaneous multithreading architectures further exacerbate this register pressure. This paper evaluates two register sharing techniques for reducing register usage. The first technique dynamically combines physical registers that hold the same value; the second combines the demand of several instructions that update the same logical register and shares physical register storage among them. While similar techniques have been proposed previously, an important contribution of this paper is to exploit only special cases that provide most of the benefits of more general solutions at very low hardware complexity. Despite its simplicity, our design reduces the required number of physical registers by more than 10% on some applications and provides almost half of the total benefits of an aggressive (complex) scheme. More importantly, we show that reducing register pressure with the simpler design has significant performance effects in a simultaneous multithreaded (SMT) architecture, where register availability can be a bottleneck. Our results show an average performance improvement of 25.6% for an SMT architecture with 160 registers or, equivalently, performance similar to that of an SMT with 200 registers (25% more) but no register sharing.
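
The first technique, combining physical registers that hold the same value, can be illustrated with a small reference-counted model. This is only a sketch under stated assumptions, not the paper's design: the abstract does not say which special cases the hardware exploits, so the model below shares on any equal value, which is closer to the general scheme the paper contrasts itself with. The class name `SharedRegisterFile` and its methods are hypothetical.

```python
class SharedRegisterFile:
    """Toy physical register file that shares one register among equal values.

    Assumption: sharing is allowed for ANY matching value. The paper restricts
    itself to special cases that the abstract does not enumerate.
    """

    def __init__(self, num_regs):
        self.free = list(range(num_regs))  # ids of unallocated physical registers
        self.value = {}                    # physical register -> value it holds
        self.refcount = {}                 # physical register -> number of sharers
        self.by_value = {}                 # value -> physical register holding it

    def write(self, value):
        """Return a physical register holding `value`, reusing one if possible."""
        preg = self.by_value.get(value)
        if preg is not None:
            self.refcount[preg] += 1       # share instead of allocating
            return preg
        if not self.free:
            raise RuntimeError("out of physical registers")
        preg = self.free.pop()
        self.value[preg] = value
        self.refcount[preg] = 1
        self.by_value[value] = preg
        return preg

    def release(self, preg):
        """Drop one sharer; free the register when no sharers remain."""
        self.refcount[preg] -= 1
        if self.refcount[preg] == 0:
            del self.by_value[self.value[preg]]
            del self.value[preg]
            del self.refcount[preg]
            self.free.append(preg)

    def in_use(self):
        """Number of physical registers currently allocated."""
        return len(self.value)


rf = SharedRegisterFile(4)
a = rf.write(0)   # first producer of the value 0
b = rf.write(0)   # second producer shares the same physical register
c = rf.write(7)   # a different value needs its own register
```

In this model three results occupy two physical registers, and the shared register returns to the free list only after both of its sharers have been released. Real hardware would also need to detect equal values cheaply at rename or writeback time, which is where restricting the scheme to a few special cases would cut complexity.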