A physical level study and optimization of CAM-based checkpointed register alias table

Authors:
Elham Safi;Andreas Moshovos;Andreas Veneris
Affiliations:
University of Toronto, Toronto, ON, Canada;University of Toronto, Toronto, ON, Canada;University of Toronto, Toronto, ON, Canada
Venue:
Proceedings of the 13th international symposium on Low power electronics and design
Year:
2008

Abstract

Using full-custom layouts in a 130 nm technology, this work studies how the latency and energy of a checkpointed, CAM-based Register Alias Table (cRAT) vary as a function of the window size, the issue width, and the number of embedded global checkpoints (GCs). These results are compared to those of an SRAM-based RAT (sRAT). Understanding these variations is useful during the early stages of architectural exploration, when physical-level information is not yet available. Compared to the sRAT, the cRAT is found to be more sensitive to the number of physical registers and to the issue width, but less sensitive to the number of GCs. In addition, beyond a certain number of GCs, the cRAT becomes faster than its equivalent sRAT; for instance, this holds when a RAT for 64 architectural and 128 physical registers has at least 20 GCs. This work also proposes an energy optimization for the cRAT that selectively disables cRAT entries that do not result in a match during lookup. The energy savings are, for the most part, a function of the number of physical registers. For instance, for a cRAT with 128 entries, energy is reduced by 40%.
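
To make the structure discussed in the abstract concrete, the following is a minimal behavioral sketch of a CAM-based RAT with global checkpoints. It is not the authors' circuit. It assumes the common organization of one entry per physical register, each holding an architectural register tag and a valid bit, with a checkpoint stored as a copy of the valid bits. The class name `CamRat`, its methods, and the `gate_invalid` flag are hypothetical. The comparison counter is only a rough stand-in for lookup energy, and skipping entries whose valid bit is clear is one assumed way of disabling entries that cannot match; the paper's actual mechanism may differ.

```python
class CamRat:
    """Behavioral model of a CAM-based RAT (illustrative, not a circuit model)."""

    def __init__(self, num_physical):
        self.num_physical = num_physical
        self.arch_tag = [None] * num_physical   # architectural register held by each entry
        self.valid = [False] * num_physical     # True if the entry is the current mapping
        self.checkpoints = {}                   # checkpoint id -> saved copy of valid bits
        self.comparisons = 0                    # crude proxy for lookup energy

    def rename_dest(self, arch_reg, phys_reg):
        """Map arch_reg to a free physical register, invalidating the old mapping."""
        for p in range(self.num_physical):
            if self.valid[p] and self.arch_tag[p] == arch_reg:
                self.valid[p] = False
        self.arch_tag[phys_reg] = arch_reg
        self.valid[phys_reg] = True

    def lookup(self, arch_reg, gate_invalid=True):
        """Search all entries for arch_reg and return the matching physical register.

        With gate_invalid=True, entries whose valid bit is clear are skipped
        and do not count as a comparison (assumed form of entry disabling).
        """
        hit = None
        for p in range(self.num_physical):
            if gate_invalid and not self.valid[p]:
                continue
            self.comparisons += 1
            if self.valid[p] and self.arch_tag[p] == arch_reg:
                hit = p
        return hit

    def take_checkpoint(self, cid):
        """Save the valid bits; tags stay in place until a register is reused."""
        self.checkpoints[cid] = list(self.valid)

    def restore_checkpoint(self, cid):
        """Recover the mapping state by restoring the saved valid bits."""
        self.valid = list(self.checkpoints[cid])
```

In this sketch a checkpoint costs one bit per physical register, and the number of comparisons in an ungated lookup equals the number of entries. Both properties are consistent with the abstract's observation that cRAT behavior depends mainly on the number of physical registers, though the model says nothing about actual latency or energy.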