Fast branch misprediction recovery in out-of-order superscalar processors

Authors:
Peng Zhou;Soner Önder;Steve Carr
Affiliations:
Michigan Technological University, Houghton, Michigan;Michigan Technological University, Houghton, Michigan;Michigan Technological University, Houghton, Michigan
Venue:
Proceedings of the 19th annual international conference on Supercomputing
Year:
2005

Abstract

Current trends in modern out-of-order processors involve implementing deeper pipelines and a large instruction window to achieve high performance. However, as pipeline depth increases, the branch misprediction penalty becomes a critical factor in overall processor performance. Current approaches to handling branch mispredictions either incrementally roll back to in-order state by waiting until the mispredicted branch reaches the head of the reorder buffer, or utilize checkpointing at branches for faster recovery. Rolling back to in-order state stalls the pipeline for a significant number of cycles, and checkpointing is costly.

This paper proposes a fast recovery mechanism, called Eager Misprediction Recovery (EMR), to reduce the branch misprediction penalty. Upon a misprediction, the processor immediately starts fetching and renaming instructions from the correct path without restoring the map table. Those instructions that access incorrect speculative values wait until the correct data are restored; however, instructions that access correct values continue executing while recovery occurs. Thus, the recovery mechanism hides the latency of long branch recovery with useful instructions.

EMR achieves a mean performance improvement very close to that of a recovery mechanism that supports checkpointing at each branch. In addition, EMR provides an average of 9.0% and up to 19.9% better performance than traditional sequential misprediction recovery on the SPEC2000 benchmark suite.
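The split between instructions that must wait and instructions that may continue can be illustrated with a toy model. The sketch below is not the hardware mechanism from the paper. It assumes a hypothetical function `classify_correct_path`, a set of architectural registers whose map-table entries were overwritten on the wrong path, and a simplified rule that an instruction waits if any of its sources is still unrestored or is produced by a waiting instruction.

```python
def classify_correct_path(stale_regs, correct_path):
    """Toy model: split correct-path instructions into those that can
    proceed during recovery and those that must wait.

    stale_regs   -- registers whose mappings were clobbered by wrong-path
                    instructions and have not been restored yet (assumption)
    correct_path -- list of (name, dest, srcs) tuples in program order
    """
    blocked = set(stale_regs)  # registers whose correct value is unavailable
    ready, waiting = [], []
    for name, dest, srcs in correct_path:
        if any(src in blocked for src in srcs):
            # Reads an unrestored value, or depends on an instruction that does.
            waiting.append(name)
            blocked.add(dest)
        else:
            # All sources are correct; the new write also makes dest correct.
            ready.append(name)
            blocked.discard(dest)
    return ready, waiting


# Hypothetical example: the wrong path renamed r1 and r2.
stale = {"r1", "r2"}
program = [
    ("i1", "r3", ("r4", "r5")),  # untouched sources
    ("i2", "r6", ("r1", "r3")),  # reads stale r1
    ("i3", "r1", ("r4", "r4")),  # redefines r1 on the correct path
    ("i4", "r7", ("r1", "r6")),  # r6 comes from a waiting instruction
    ("i5", "r8", ("r1", "r3")),  # r1 is correct again after i3
]
ready, waiting = classify_correct_path(stale, program)
print("proceed:", ready)
print("wait:", waiting)
```

In this example three of the five correct-path instructions can proceed while the stale registers are still being restored, which is the kind of overlap the abstract describes as hiding recovery latency with useful work.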