Register files are becoming one of the critical components of current out-of-order processors in terms of delay and power consumption, because a processor's ability to exploit instruction-level parallelism depends closely on the size and number of ports of its register file. Conventional register renaming schemes release a physical register conservatively, only after the instruction that redefines the same logical register has committed. Instead, we propose a scheme that releases registers as soon as the processor knows that they will not be used again. We present two early-release hardware implementations with different performance/complexity trade-offs. Detailed cycle-level simulations show either a significant speedup for a given register file size or a reduction in register file size for a given performance level.
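To make the difference between the two release policies concrete, the following toy sketch compares them on a short in-order trace. It is an illustration under strong assumptions, not either of the hardware implementations the abstract mentions: the function name `release_points`, the trace encoding, and the use of program-order indices as a stand-in for commit times are all hypothetical, and speculation, exceptions, and branch mispredictions are ignored. The "early" point is an idealized bound: the position of the last instruction that reads the value before it is redefined.

```python
from typing import List, Optional, Tuple

# One instruction: (destination logical register or None, source logical registers).
# This encoding is a hypothetical simplification for illustration only.
Instr = Tuple[Optional[str], Tuple[str, ...]]


def release_points(trace: List[Instr]) -> List[Tuple[int, int, int]]:
    """Return (producer, conventional_release, early_release) per value.

    All three entries are program-order indices into the trace:
    - conventional_release: index of the next instruction that redefines
      the same logical register (the register is freed when it commits);
    - early_release: index of the last reader of the value before that
      redefinition (an idealized "no further use" point).
    Values that are never redefined within the trace are skipped.
    """
    results = []
    for i, (dest, _) in enumerate(trace):
        if dest is None:
            continue
        last_use = i          # with no readers, the value is dead right away
        redefined_at = None
        for j in range(i + 1, len(trace)):
            later_dest, later_srcs = trace[j]
            if dest in later_srcs:
                last_use = j  # this instruction still reads the old value
            if later_dest == dest:
                redefined_at = j
                break
        if redefined_at is not None:
            results.append((i, redefined_at, last_use))
    return results


# Hypothetical six-instruction trace.
trace: List[Instr] = [
    ("r1", ()),        # 0: r1 = load
    ("r2", ("r1",)),   # 1: r2 = r1 + r1
    ("r3", ("r2",)),   # 2: r3 = r2 * 2
    ("r4", ("r3",)),   # 3: r4 = r3 + 1
    ("r1", ("r4",)),   # 4: r1 = r4   (redefines r1)
    ("r2", ("r1",)),   # 5: r2 = r1   (redefines r2)
]

for producer, conventional, early in release_points(trace):
    print(f"value from instr {producer}: conventional release at {conventional}, "
          f"idealized early release at {early}")
```

The gap between the two indices for each value is the interval during which a physical register stays allocated without being read again under the conventional policy. How much of that gap real hardware can recover depends on when it can safely know that no further use will occur, which this sketch does not model.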