Checkpoint prediction and intelligent checkpoint management have recently been proposed to reduce the number of coarse-grain checkpoints needed to achieve high performance through speculative execution. In this work, we take a closer look at several checkpoint prediction and management alternatives, comparing their performance and requirements as the scheduler window size increases. We also study a few additional design choices. The key contribution of this work is BranchTap, a novel checkpoint-aware speculation strategy that temporarily throttles speculation to reduce recovery cost while allowing speculation to proceed when it is likely to boost performance. BranchTap dynamically adapts to application behavior. We demonstrate that, for a 1K-entry window processor with a FIFO of just four checkpoints, our adaptive speculation control mechanism leads to an average performance degradation of just 1.49% compared to a processor with an infinite number of checkpoints. Without BranchTap, that is, using prediction-based checkpoint allocation alone, the average degradation is 2.08%, so BranchTap represents a 28.3% improvement. For the same configuration, BranchTap reduces the worst-case degradation from 8.99% to 5.64%.
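The reported improvement appears to be the relative reduction in average degradation, and the following minimal sketch reproduces that arithmetic from the figures quoted above. The function name `relative_reduction` and the variable names are illustrative assumptions, not taken from the paper, and the worst-case ratio is derived here rather than stated by the authors.

```python
def relative_reduction(baseline: float, improved: float) -> float:
    """Return the fraction by which `improved` is lower than `baseline`."""
    return (baseline - improved) / baseline


# Degradation figures quoted in the abstract, in percent, relative to a
# processor with an infinite number of checkpoints.
avg_without, avg_with = 2.08, 1.49      # average degradation
worst_without, worst_with = 8.99, 5.64  # worst-case degradation

# Relative reductions computed from the rounded, published figures.
avg_gain = relative_reduction(avg_without, avg_with)
worst_gain = relative_reduction(worst_without, worst_with)

print(f"average: {avg_gain:.1%}, worst case: {worst_gain:.1%}")
```

Computed from the rounded figures, the average reduction comes out at roughly 28%, in line with the reported 28.3%; any small difference in the last digit presumably reflects rounding of the published inputs. The same calculation puts the worst-case reduction at roughly 37%.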