Alternative implementations of two-level adaptive branch prediction
ISCA '92 Proceedings of the 19th annual international symposium on Computer architecture
Proceedings of the 24th annual international symposium on Computer architecture
Trading conflict and capacity aliasing in conditional branch predictors
Proceedings of the 24th annual international symposium on Computer architecture
The effect of instruction fetch bandwidth on value prediction
Proceedings of the 25th annual international symposium on Computer architecture
Improving branch predictors by correlating on data values
Proceedings of the 32nd annual ACM/IEEE international symposium on Microarchitecture
Automatically characterizing large scale program behavior
Proceedings of the 10th international conference on Architectural support for programming languages and operating systems
Predicting Conditional Branches With Fusion-Based Hybrid Predictors
Proceedings of the 2002 International Conference on Parallel Architectures and Compilation Techniques
Dynamic Data Dependence Tracking and its Application to Branch Prediction
HPCA '03 Proceedings of the 9th International Symposium on High-Performance Computer Architecture
Control-Flow Speculation through Value Prediction for Superscalar Processors
PACT '99 Proceedings of the 1999 International Conference on Parallel Architectures and Compilation Techniques
Effective ahead pipelining of instruction block address generation
Proceedings of the 30th annual international symposium on Computer architecture
Proceedings of the 30th annual international symposium on Computer architecture
Performance, Power Efficiency and Scalability of Asymmetric Cluster Chip Multiprocessors
IEEE Computer Architecture Letters
Core architecture optimization for heterogeneous chip multiprocessors
Proceedings of the 15th international conference on Parallel architectures and compilation techniques
Proceedings of the 13th international conference on Architectural support for programming languages and operating systems
Amdahl's Law in the Multicore Era
Computer
Accelerating critical section execution with asymmetric multi-core architectures
Proceedings of the 14th international conference on Architectural support for programming languages and operating systems
The significance of affectors and affectees correlations for branch prediction
HiPEAC'08 Proceedings of the 3rd international conference on High performance embedded architectures and compilers
CVP: an energy-efficient indirect branch prediction with compiler-guided value pattern
Proceedings of the 26th ACM international conference on Supercomputing
Toward virtualizing branch direction prediction
DATE '12 Proceedings of the Conference on Design, Automation and Test in Europe
Hi-index | 0.00 |
Branches that depend directly or indirectly on load instructions are a leading cause of mispredictions by state-of-the-art branch predictors. For a branch of this type, there is a unique dynamic instance of the branch for each unique combination of producer-load addresses. Based on this definition, a study of mispredictions reveals two related problems: (i) Global branch history often fails to distinguish between different dynamic branches. In this case, the predictor is unable to specialize predictions for different dynamic branches, causing mispredictions if their outcomes differ. Ideally, the remedy is to predict a dynamic branch using its program counter (PC) and the addresses of its producer loads, since this context uniquely identifies the dynamic branch. We call this context the identity, or ID, of the dynamic branch. In general, producer loads are unlikely to have generated their addresses when the dynamic branch is fetched. We show that the ID of a distant retired branch in the global branch stream combined with recent global branch history, is effective context for predicting the current branch. (ii) Fixing the first problem exposes another problem. A store to an address on which a dynamic branch depends may flip its outcome when it is next encountered. With conventional passive updates, the branch suffers a misprediction before the predictor is retrained. We propose that stores to the memory addresses on which a dynamic branch depends, directly update its prediction in the predictor. This novel "active update" concept avoids mispredictions that are otherwise incurred by conventional passive training. We highlight two practical features that enable large EXACT predictors: the prediction path is scalably pipelinable by virtue of its decoupled indexing strategy, and active updates are tolerant of 100s of cycles of latency making it ideal for virtualizing this component in the general-purpose memory hierarchy. We also present a compact form of the predictor that caches only dynamic instances of a static branch that differ from its overall bias.