Multiple-block ahead branch predictors

Authors:
André Seznec;Stéphan Jourdan;Pascal Sainrat;Pierre Michaud
Affiliations:
IRISA, Campus de Beaulieu, 35042 Rennes, France;IRIT, Université Paul Sabatier, 31062 Toulouse, France;IRIT, Université Paul Sabatier, 31062 Toulouse, France;IRISA, Campus de Beaulieu, 35042 Rennes, France
Venue:
Proceedings of the seventh international conference on Architectural support for programming languages and operating systems
Year:
1996

Abstract

A basic rule in computer architecture is that a processor cannot execute an application faster than it fetches its instructions. This paper presents a novel cost-effective mechanism called the two-block ahead branch predictor. Information from the current instruction block is not used for predicting the address of the next instruction block, but rather for predicting the address of the block following the next instruction block.

This approach overcomes the instruction fetch bottleneck exhibited by wide-dispatch "brainiac" processors by enabling them to efficiently predict the addresses of two instruction blocks in a single cycle. Furthermore, "speed demon" processors can use our predictor to pipeline the branch prediction process, either to achieve a higher clock rate or to improve prediction accuracy by means of bigger prediction structures.

Moreover, unlike previously proposed multiple-branch prediction schemes, multiple-block ahead branch predictors can use any branch prediction scheme to perform the very accurate predictions required to achieve high performance on superscalar processors.
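The indexing idea in the abstract can be illustrated with a short sketch. It is not the predictor evaluated in the paper: it assumes a simple last-target table keyed only by block address, and the names `TwoBlockAheadPredictor`, `run_trace`, and the example addresses are hypothetical. It shows only that the entry for block i is trained with, and predicts, the address of block i+2.

```python
# Illustrative sketch only: a last-target table, not the structure from the paper.
class TwoBlockAheadPredictor:
    def __init__(self):
        # Hypothetical table: block address -> last observed address two blocks later.
        self.table = {}

    def predict(self, block):
        # Returns None when the block has never been seen.
        return self.table.get(block)

    def update(self, block, block_two_ahead):
        self.table[block] = block_two_ahead


def run_trace(trace):
    """Replay a sequence of block addresses; return (correct, total) predictions."""
    predictor = TwoBlockAheadPredictor()
    correct = total = 0
    for i in range(len(trace) - 2):
        current, two_ahead = trace[i], trace[i + 2]
        # Information from block i predicts block i+2, skipping block i+1.
        if predictor.predict(current) == two_ahead:
            correct += 1
        total += 1
        predictor.update(current, two_ahead)
    return correct, total


# A loop over three blocks, repeated four times (addresses are made up).
trace = [0x100, 0x140, 0x180] * 4
correct, total = run_trace(trace)
print(f"{correct} of {total} two-block-ahead predictions correct")
```

Because the prediction for block i+2 depends only on block i, a fetch unit holding blocks i and i+1 could in principle look up the addresses of blocks i+2 and i+3 in the same cycle, which is the property the abstract relies on. A real design would replace the dictionary with a finite table and an actual branch prediction scheme.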