Highly concurrent scalar processing
ISCA '86 Proceedings of the 13th annual international symposium on Computer architecture
Two-level adaptive training branch prediction
MICRO 24 Proceedings of the 24th annual international symposium on Microarchitecture
Effective compiler support for predicated execution using the hyperblock
MICRO 25 Proceedings of the 25th annual international symposium on Microarchitecture
Branch classification: a new mechanism for improving branch predictor performance
MICRO 27 Proceedings of the 27th annual international symposium on Microarchitecture
The effects of predicated execution on branch prediction
MICRO 27 Proceedings of the 27th annual international symposium on Microarchitecture
Characterizing the impact of predicated execution on branch prediction
MICRO 27 Proceedings of the 27th annual international symposium on Microarchitecture
PACT '95 Proceedings of the IFIP WG10.3 working conference on Parallel architectures and compilation techniques
Software pipelining showdown: optimal vs. heuristic methods in a production compiler
PLDI '96 Proceedings of the ACM SIGPLAN 1996 conference on Programming language design and implementation
Assigning confidence to conditional branch predictions
Proceedings of the 29th annual ACM/IEEE international symposium on Microarchitecture
A Study of Control Independence in Superscalar Processors
HPCA '99 Proceedings of the 5th International Symposium on High Performance Computer Architecture
Dynamic Hammock Predication for Non-Predicated Instruction Set Architectures
PACT '98 Proceedings of the 1998 International Conference on Parallel Architectures and Compilation Techniques
A New Framework for Integrated Global Local Scheduling
PACT '98 Proceedings of the 1998 International Conference on Parallel Architectures and Compilation Techniques
Measuring the Parallelism Available for Very Long Instruction Word Architectures
IEEE Transactions on Computers
Hi-index | 0.00 |
Modern dynamically scheduled processors use branch prediction hardware to speculatively fetch and execute most likely executed paths in a program. Complex branch predictors have been proposed which attempt to identify these paths accurately such that the hardware can benefit from out-of-order (OOO) execution. Recent studies have shown that in spite of such complex prediction schemes, there still exist many frequently executed branches which are difficult to predict. Predicated execution has been proposed as an alternative technique to eliminate some of these branches in various forms ranging from a restrictive support to a full-blown support. We call the restrictive form of predicated execution as guarded execution.In this paper, we propose a new algorithm, which uses profiling and selectively performs if-conversion for architectures with guarded execution support. Branch profiling is used to gather the taken, non-taken and misprediction counts for every branch. This combined with block profiling is used to select paths, which suffer from heavy mispredictions and are profitable to if-convert. Effects of three different selection criteria, namely size-based, predictability-based and profiled-based, on net cycle improvements, branch mispredictions and mis-speculated instructions are then studied. We also explain numerous adjustments that were made to the selection criteria to better reflect the OOO processor behavior.