BranchTap: improving performance with very few checkpoints through adaptive speculation control

Authors:
Patrick Akl;Andreas Moshovos
Affiliations:
University of Toronto;University of Toronto
Venue:
Proceedings of the 20th annual international conference on Supercomputing
Year:
2006

Abstract

Checkpoint prediction and intelligent checkpoint management have recently been proposed to reduce the number of coarse-grain checkpoints needed to achieve high performance through speculative execution. In this work, we take a closer look at various checkpoint prediction and management alternatives, comparing their performance and requirements as the scheduler window size increases. We also study a few additional design choices. The key contribution of this work is BranchTap, a novel checkpoint-aware speculation strategy that temporarily throttles speculation to reduce recovery cost while allowing speculation to proceed when it is likely to boost performance. BranchTap dynamically adapts to application behavior. We demonstrate that for a 1K-entry window processor with a FIFO of just four checkpoints, our adaptive speculation control mechanism leads to an average performance degradation of just 1.49% compared to a processor that has an infinite number of checkpoints. This is a 28.3% reduction relative to using prediction-based checkpoint allocation alone, for which the average performance degradation is 2.08%. For the same configuration, BranchTap decreases the worst-case degradation from 8.99% to 5.64%.
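
The relative figures in the abstract can be checked with a short calculation. The sketch below is only an illustration: the helper name `relative_reduction` is hypothetical, and the inputs are the rounded percentages quoted in the abstract, so the result may differ in the last digit from the 28.3% reported by the authors, who presumably worked from unrounded data.

```python
def relative_reduction(baseline: float, improved: float) -> float:
    """Fractional reduction of `improved` relative to `baseline`.

    Hypothetical helper for illustration; not part of the paper.
    """
    return (baseline - improved) / baseline


# Rounded degradation figures quoted in the abstract (percent slowdown
# versus a processor with an infinite number of checkpoints).
avg = relative_reduction(baseline=2.08, improved=1.49)
worst = relative_reduction(baseline=8.99, improved=5.64)

print(f"average: {avg:.1%}, worst case: {worst:.1%}")
# prints "average: 28.4%, worst case: 37.3%"
```

With these rounded inputs the average-case reduction comes out at roughly 28.4%, consistent with the reported 28.3% up to rounding. The worst-case reduction of roughly 37.3% is derived here from the quoted endpoints and is not a figure stated in the abstract.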