A novel architecture for ahead branch prediction

Authors:
Wenbing Jin;Feng Shi;Qiugui Song;Yang Zhang
Affiliations:
School of Computer Science and Technology, Beijing Institute of Technology, Beijing, China 100081 and Department of Trade and Military Industry, North Automatic Control Technology Institute, Taiyu ...;School of Computer Science and Technology, Beijing Institute of Technology, Beijing, China 100081;Department of Trade and Military Industry, North Automatic Control Technology Institute, Taiyuan, China 030006;School of Computer Science and Technology, Beijing Institute of Technology, Beijing, China 100081 and School of Information Science and Engineering, Hebei University of Science and Technology, Shi ...
Venue:
Frontiers of Computer Science: Selected Publications from Chinese Universities
Year:
2013

Abstract

In theory, branch predictors with more complicated algorithms and larger data structures provide more accurate predictions. In practice, overly large structures and excessively complicated algorithms cannot be implemented because of their long access delay. Many strategies have been proposed to balance delay against accuracy, but none has completely solved the problem. The architecture for ahead branch prediction (A2BP) separates a traditional predictor into two parts. The first is a small table located at the front end of the pipeline, which keeps prediction latency short enough even for some aggressive processors. The second comprises the complicated algorithms and large data structures needed for accurate prediction, all of which are moved to the back end of the pipeline. An effective mechanism is introduced for performing ahead branch prediction in the back end and updating the small table in the front end. To substantially improve prediction accuracy, an indirect branch prediction algorithm based on branch history and target path (BHTP) is implemented in A2BP. Experiments with the Standard Performance Evaluation Corporation (SPEC) benchmarks on the gem5/SimpleScalar simulators demonstrate that A2BP improves average performance by 2.92% compared with a commonly used branch target buffer-based predictor. In addition, the BHTP algorithm reduces indirect branch misses by an average of 28.98% compared with the traditional algorithm.
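
The split described in the abstract can be illustrated with a small model. The sketch below is not the authors' design: the class names, table sizes, update policy, and the way branch history and target path are combined into a lookup key are all assumptions made for illustration. It shows a small front-end table that is read at fetch time, and a larger back-end predictor that is trained on each resolved indirect branch and then writes its next prediction into the front-end table ahead of use. A last-target baseline, standing in for a simple branch target buffer, is included for comparison.

```python
from collections import OrderedDict


class FrontEndTable:
    """Small, fast table read at fetch time (capacity is a hypothetical value)."""

    def __init__(self, capacity=8):
        self.capacity = capacity
        self.entries = OrderedDict()  # branch pc -> predicted target

    def predict(self, pc):
        return self.entries.get(pc)  # None means no prediction is available

    def install(self, pc, target):
        self.entries[pc] = target
        self.entries.move_to_end(pc)
        if len(self.entries) > self.capacity:
            self.entries.popitem(last=False)  # evict the oldest entry


class BackEndPredictor:
    """Large predictor keyed by pc, branch history, and recent target path.

    Assumption: a tuple key keeps the sketch collision-free; real hardware
    would hash these fields into a fixed number of index bits.
    """

    def __init__(self, history_bits=8, path_len=2):
        self.history_mask = (1 << history_bits) - 1
        self.path_len = path_len
        self.history = 0   # recent taken/not-taken outcomes, one bit each
        self.path = ()     # most recent indirect branch targets
        self.table = {}    # (pc, history, path) -> target

    def _key(self, pc):
        return (pc, self.history, self.path)

    def record_direction(self, taken):
        # Shift a conditional branch outcome into the history register.
        self.history = ((self.history << 1) | int(taken)) & self.history_mask

    def predict(self, pc):
        return self.table.get(self._key(pc))

    def train(self, pc, target):
        # Learn the target under the context seen before this branch,
        # then extend the target path with the resolved target.
        self.table[self._key(pc)] = target
        self.path = (self.path + (target,))[-self.path_len:]


def simulate(trace, front, back):
    """Count front-end misses over a trace of (pc, target) indirect branches."""
    misses = 0
    for pc, target in trace:
        if front.predict(pc) != target:
            misses += 1
        back.train(pc, target)
        ahead = back.predict(pc)  # prediction for the next visit to this pc
        if ahead is not None:
            front.install(pc, ahead)
    return misses


def btb_misses(trace):
    """Baseline: predict the last target seen for each pc."""
    last, misses = {}, 0
    for pc, target in trace:
        if last.get(pc) != target:
            misses += 1
        last[pc] = target
    return misses
```

On a trace in which one indirect branch alternates between two targets, the last-target baseline misses on every visit, while this sketch misses only during warm-up, because the target path distinguishes the two contexts. The figures reported in the abstract come from the authors' simulator experiments, not from a toy model like this one.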