Modern micro-architectures employ superscalar techniques to enhance system performance. A superscalar microprocessor must fetch at least one instruction cache line at a time to sustain a high issue rate and a large amount of speculative execution. In this paper, we propose Grouped Branch Prediction (GBP), which can recognize and predict multiple branches in the same instruction cache line for a wide-issue micro-architecture. Several GBP configurations with different group sizes are simulated. The simulation results show that the branch penalty for group size 4 with 2048 entries is under 0.65 clock cycles. In our design, we choose the two-group scheme with group size 4. This configuration achieves an average of 4.9 IPC_f (the number of instructions fetched per cycle by the machine front end). Furthermore, we extend the GBP to support Two Cache Lines Prediction with two fetch units. The 2048-entry, 2-group scheme with group size 4 then achieves an average of 8.4 IPC_f, approximately 66.5% better than the original 2-group GBP. The added hardware cost (41.5k bits) is less than 40%.
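The abstract does not spell out how a group entry is organized, so the following is only a minimal sketch of the general idea of predicting several branches within one group of instruction slots. It assumes one 2-bit saturating counter per slot, a table indexed by group address modulo the table size, and a branch mask supplied by the caller; the names `GroupEntry`, `GroupedPredictor`, `predict`, and `update` are hypothetical and are not taken from the paper. Only the group size of 4 and the 2048 entries come from the abstract.

```python
from dataclasses import dataclass, field
from typing import List, Optional

GROUP_SIZE = 4      # instruction slots per group (value from the abstract)
NUM_ENTRIES = 2048  # number of table entries (value from the abstract)


@dataclass
class GroupEntry:
    # Assumption: one 2-bit saturating counter per slot.
    # Values 0-1 predict not taken, values 2-3 predict taken.
    counters: List[int] = field(default_factory=lambda: [1] * GROUP_SIZE)


class GroupedPredictor:
    """Hypothetical grouped predictor: one table lookup covers a whole group."""

    def __init__(self, num_entries: int = NUM_ENTRIES) -> None:
        self.table = [GroupEntry() for _ in range(num_entries)]

    def _index(self, group_addr: int) -> int:
        # Assumption: simple modulo indexing by group address.
        return group_addr % len(self.table)

    def predict(self, group_addr: int, is_branch: List[bool]) -> Optional[int]:
        """Return the slot of the first branch predicted taken, or None
        if every branch in the group is predicted not taken."""
        entry = self.table[self._index(group_addr)]
        for slot in range(GROUP_SIZE):
            if is_branch[slot] and entry.counters[slot] >= 2:
                return slot
        return None

    def update(self, group_addr: int, slot: int, taken: bool) -> None:
        """Train the counter of one slot with the resolved branch outcome."""
        entry = self.table[self._index(group_addr)]
        c = entry.counters[slot]
        entry.counters[slot] = min(3, c + 1) if taken else max(0, c - 1)


if __name__ == "__main__":
    predictor = GroupedPredictor()
    mask = [False, True, False, True]   # slots 1 and 3 hold branches
    print(predictor.predict(0x40, mask))
    predictor.update(0x40, 3, True)     # slot 3 resolves as taken
    print(predictor.predict(0x40, mask))
```

In this sketch a single lookup yields a prediction for every branch in the group, and fetch would be redirected at the first slot predicted taken. How the actual GBP recognizes branches, indexes its table, and combines two groups is not stated in the abstract and is not modeled here.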