To achieve high performance, wide-issue superscalar processors must fetch a large number of instructions per cycle. Conditional branches are the primary impediment to increasing fetch bandwidth because they are very frequent and can alter the flow of control. To overcome this problem, these processors need to predict the outcomes of multiple branches in a cycle. This paper investigates two control flow prediction schemes that predict the effective outcome of multiple branches with a single prediction. Instead of treating individual branches as the basic units of prediction, these schemes treat subgraphs of the executed program's control flow graph as the basic units and predict the target of an entire subgraph at a time, allowing the superscalar fetch mechanism to go past multiple branches in a cycle. The first scheme uses sequential block-like subgraphs and the second uses tree-like subgraphs. Both schemes make a 1-out-of-4 prediction, as opposed to the 1-out-of-2 prediction made by branch-level prediction schemes. The two schemes are evaluated on a MIPS ISA-based 12-way superscalar microarchitecture. Relative to identical microprocessors that use branch-level prediction, the effective fetch size improves by approximately 25 percent for the block-like scheme and 50 percent for the tree-like scheme. No appreciable difference in prediction accuracy was observed, even though the control flow prediction schemes predict 1 out of 4 outcomes.
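
The idea of a 1-out-of-4 subgraph-level prediction can be illustrated with a minimal sketch. It is not the predictor evaluated in the paper: the table organization (one 2-bit saturating counter per exit), the indexing by subgraph start address, the depth-2 tree shape, and the names `SubgraphPredictor` and `exit_index` are all assumptions made for illustration only.

```python
NUM_EXITS = 4       # 1-out-of-4 prediction, as stated in the abstract
COUNTER_MAX = 3     # 2-bit saturating counters (an assumed choice)


def exit_index(first_taken, second_taken):
    """Map two branch outcomes in an assumed depth-2 tree to an exit 0..3."""
    return 2 * int(first_taken) + int(second_taken)


class SubgraphPredictor:
    """Hypothetical predictor that picks one of four exits per subgraph."""

    def __init__(self, table_bits=10):
        self.mask = (1 << table_bits) - 1
        self.table = {}  # table index -> list of NUM_EXITS counters

    def _counters(self, start_pc):
        # Drop the two low bits (4-byte instructions), then index the table.
        index = (start_pc >> 2) & self.mask
        return self.table.setdefault(index, [0] * NUM_EXITS)

    def predict(self, start_pc):
        """Return the exit with the highest counter (lowest index on ties)."""
        counters = self._counters(start_pc)
        return max(range(NUM_EXITS), key=lambda i: counters[i])

    def update(self, start_pc, actual_exit):
        """Strengthen the exit actually taken and weaken the other three."""
        counters = self._counters(start_pc)
        for i in range(NUM_EXITS):
            if i == actual_exit:
                counters[i] = min(COUNTER_MAX, counters[i] + 1)
            else:
                counters[i] = max(0, counters[i] - 1)


if __name__ == "__main__":
    predictor = SubgraphPredictor()
    start = 0x400100  # hypothetical start address of a subgraph
    for _ in range(3):
        predictor.update(start, exit_index(True, False))
    print(predictor.predict(start))
```

In this sketch a single lookup selects the exit of the whole subgraph, so the fetch unit would not need a separate prediction for each of the two branches inside it. That is the property the abstract credits for the larger effective fetch size.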