The MIPS R10000 Superscalar Microprocessor
IEEE Micro
The Design of a Register Renaming Unit
GLS '99 Proceedings of the Ninth Great Lakes Symposium on VLSI
Checkpointing alternatives for high performance, power-aware processors
Proceedings of the 2003 international symposium on Low power electronics and design
Complexity-effective superscalar processors
Complexity-effective superscalar processors
Inherently lower-power high-performance superscalar architectures
Inherently lower-power high-performance superscalar architectures
An analysis of a resource efficient checkpoint architecture
ACM Transactions on Architecture and Code Optimization (TACO)
Reducing Rename Logic Complexity for High-Speed and Low-Power Front-End Architectures
IEEE Transactions on Computers
A physical level study and optimization of CAM-based checkpointed register alias table
Proceedings of the 13th international symposium on Low power electronics and design
Proceedings of the 36th annual international symposium on Computer architecture
Turbo-ROB: a low cost checkpoint/restore accelerator
HiPEAC'08 Proceedings of the 3rd international conference on High performance embedded architectures and compilers
On the latency and energy of checkpointed superscalar register alias tables
IEEE Transactions on Very Large Scale Integration (VLSI) Systems
Exploiting replicated checkpoints for soft error detection and correction
Proceedings of the Conference on Design, Automation and Test in Europe
Hi-index | 0.00 |
We present two full-custom implementations of the Register Alias Table (RAT) for a 4-way superscalar dynamically-scheduled processor in a commercial 130nm CMOS technology. The implementations differ in the way they organize the embedded global checkpoints (GCs) which support speculative execution. In the first implementation, representative of early designs, the GCs are organized as shift registers. In the second implementation, representative of more recent proposals, the GCs are organized as random access buffers. We measure the impact of increasing thenumber of GCs on the latency, energy, and area of the RAT. The results support the importance of recent techniques that reduce the number of GCs while maintaining performance.