Compacting register file via 2-level renaming and bit-partitioning

Authors:
Hua Yang;Gang Cui;Hong-Wei Liu;Xiao-Zong Yang
Affiliations:
School of Computer Science and Technology, Harbin Institute of Technology, P. Box 320, Harbin 150001, China;School of Computer Science and Technology, Harbin Institute of Technology, P. Box 320, Harbin 150001, China;School of Computer Science and Technology, Harbin Institute of Technology, P. Box 320, Harbin 150001, China;School of Computer Science and Technology, Harbin Institute of Technology, P. Box 320, Harbin 150001, China
Venue:
Microprocessors & Microsystems
Year:
2007

Citing 23
Cited 0

Simultaneous multithreading: maximizing on-chip parallelism

ISCA '95 Proceedings of the 22nd annual international symposium on Computer architecture
Exploiting choice: instruction fetch and issue on an implementable simultaneous multithreading processor

ISCA '96 Proceedings of the 23rd annual international symposium on Computer architecture
Exploiting dead value information

MICRO 30 Proceedings of the 30th annual ACM/IEEE international symposium on Microarchitecture
A novel renaming scheme to exploit value temporal locality through physical register reuse and unification

MICRO 31 Proceedings of the 31st annual ACM/IEEE international symposium on Microarchitecture
Software-Directed Register Deallocation for Simultaneous Multithreaded Processors

IEEE Transactions on Parallel and Distributed Systems
Multiple-banked register file architectures

Proceedings of the 27th annual international symposium on Computer architecture
Two-level hierarchical register file organization for VLIW processors

Proceedings of the 33rd annual ACM/IEEE international symposium on Microarchitecture
Integrating superscalar processor components to implement register caching

ICS '01 Proceedings of the 15th international conference on Supercomputing
Increasing processor performance by implementing deeper pipelines

ISCA '02 Proceedings of the 29th annual international symposium on Computer architecture
Reducing the complexity of the register file in dynamic superscalar processors

Proceedings of the 34th annual ACM/IEEE international symposium on Microarchitecture
SimpleScalar: An Infrastructure for Computer System Modeling

Computer
The Alpha 21264 Microprocessor

IEEE Micro
The Design Space of Register Renaming Techniques

IEEE Micro
Cherry: checkpointed early resource recycling in out-of-order microprocessors

Proceedings of the 35th annual ACM/IEEE international symposium on Microarchitecture
Reducing register ports using delayed write-back queues and operand pre-fetch

ICS '03 Proceedings of the 17th annual international conference on Supercomputing
Register File Design Considerations in Dynamically Scheduled Processors

HPCA '96 Proceedings of the 2nd IEEE Symposium on High-Performance Computer Architecture
Virtual-Physical Registers

HPCA '98 Proceedings of the 4th International Symposium on High-Performance Computer Architecture
A Scalable Register File Architecture for Dynamically Scheduled Processors

PACT '96 Proceedings of the 1996 Conference on Parallel Architectures and Compilation Techniques
Exploiting Value Locality in Physical Register Files

Proceedings of the 36th annual IEEE/ACM International Symposium on Microarchitecture
A Content Aware Integer Register File Organization

Proceedings of the 31st annual international symposium on Computer architecture
Physical Register Inlining

Proceedings of the 31st annual international symposium on Computer architecture
A Small, Fast and Low-Power Register File by Bit-Partitioning

HPCA '05 Proceedings of the 11th International Symposium on High-Performance Computer Architecture
An efficient algorithm for exploiting multiple arithmetic units

IBM Journal of Research and Development

Quantified Score

Hi-index	0.00

Visualization

Abstract

A large multi-ported rename register file (RRF) is indispensable for simultaneous multithreaded (SMT) processors to hold more intermediate results of in-flight instructions from multiple threads running simultaneously. However, enlarging the RRF incurs longer access delays and more power consumption, both of which are critical to the overall performance and are becoming a bottleneck due to the ever-increasing pipeline depth and issue width in future SMT processors. We propose a novel register renaming scheme called Multi-usable Rename Register with 2-Level renaming and allocating (2L-MuRR), which focuses on more efficient utilization of a fewer number of rename registers. Based on the fact that the effective bit-width of most operands is narrower than the full-bit width of a register entry, 2L-MuRR partitions each rename register into several fields of different widths. Either a single field or a field combination can hold an operand, thus making each rename register multi-usable. In addition, 2L-MuRR postpones the register allocation to the write-back stage, which is similar to the formerly proposed virtual-physical-register (VPR) scheme, further reducing the meaningless RRF occupancy. The simulations show that 2L-MuRR improves the efficiency of the RRF significantly, achieving higher performance with much fewer rename registers. For example, when the RRF size is 60, 2L-MuRR outperforms Trad (traditional register renaming approach) and VPR in terms of IPC by 38% and 11%, respectively, while decreases the RRF occupancy by 37% and 15%, respectively.