Register renaming and dynamic speculation: an alternative approach
MICRO 26 Proceedings of the 26th annual international symposium on Microarchitecture
Complexity-effective superscalar processors
Proceedings of the 24th annual international symposium on Computer architecture
The energy complexity of register files
ISLPED '98 Proceedings of the 1998 international symposium on Low power electronics and design
Implementation of precise interrupts in pipelined processors
ISCA '85 Proceedings of the 12th annual international symposium on Computer architecture
Multiple-banked register file architectures
Proceedings of the 27th annual international symposium on Computer architecture
Two-level hierarchical register file organization for VLIW processors
Proceedings of the 33rd annual ACM/IEEE international symposium on Microarchitecture
Exploiting data forwarding to reduce the power budget of VLIW embedded processors
Proceedings of the conference on Design, automation and test in Europe
Dynamically allocating processor resources between nearby and distant ILP
ISCA '01 Proceedings of the 28th annual international symposium on Computer architecture
Reducing the complexity of the register file in dynamic superscalar processors
Proceedings of the 34th annual ACM/IEEE international symposium on Microarchitecture
The MIPS R10000 Superscalar Microprocessor
IEEE Micro
Cherry: checkpointed early resource recycling in out-of-order microprocessors
Proceedings of the 35th annual ACM/IEEE international symposium on Microarchitecture
Reducing register ports for higher speed and lower energy
Proceedings of the 35th annual ACM/IEEE international symposium on Microarchitecture
Proceedings of the 35th annual ACM/IEEE international symposium on Microarchitecture
Reducing register ports using delayed write-back queues and operand pre-fetch
ICS '03 Proceedings of the 17th annual international conference on Supercomputing
The PowerPC 620 microprocessor: a high performance superscalar RISC microprocessor
COMPCON '95 Proceedings of the 40th IEEE Computer Society International Conference
HPCA '98 Proceedings of the 4th International Symposium on High-Performance Computer Architecture
Banked multiported register files for high-frequency superscalar microprocessors
Proceedings of the 30th annual international symposium on Computer architecture
Reducing Datapath Energy through the Isolation of Short-Lived Operands
Proceedings of the 12th International Conference on Parallel Architectures and Compilation Techniques
Exploiting Value Locality in Physical Register Files
Proceedings of the 36th annual IEEE/ACM International Symposium on Microarchitecture
A Content Aware Integer Register File Organization
Proceedings of the 31st annual international symposium on Computer architecture
Proceedings of the 31st annual international symposium on Computer architecture
POWER4 system microarchitecture
IBM Journal of Research and Development
Hi-index | 0.00 |
Register file design is one of the critical issues facing designers of out–of–order processors. Scaling up its size and number of ports with issue width and instruction window size is difficult in terms of both performance and power consumption. Two types of register file architectures have been proposed in the past: a future logical file and a centralized physical file. The centralized register file does not scale well but allows fast branch mis–prediction recovery. The Future File scales well, but requires reservation stations and has slow mis–prediction recovery. This paper proposes a register file architecture that combines the best features of both approaches. The new register file has the large size of the centralized file and its ability to quickly recover from branch misprediction. It has the advantage of the future file in that it is accessed in the ”front end” allowing about 1/3rd of the source operands that are ready when an instruction enters the window to be read immediately. The remaining operands come from bypass logic / instruction queues and do not require register file access. The new architecture does require reservation stations for operand storage and it investigates two approaches in terms of power–efficiency. Another advantage of the new architecture is that banking is much easier to use in this case as compared to the centralized register file. Banking further improves the scalability of the new architecture. A technique for early release of short–lived registers called writeback filtering is used in combination with banking to further improve the new architecture. The use of a large front–end register file results in significant power savings and a slight IPC degradation (less than 1%). Overall, the resulting energy–delay product is lower than in previous proposals.