Overlapped loop support in the Cydra 5
ASPLOS III Proceedings of the third international conference on Architectural support for programming languages and operating systems
Computer architecture: a quantitative approach
Computer architecture: a quantitative approach
IMPACT: an architectural framework for multiple-instruction-issue processors
ISCA '91 Proceedings of the 18th annual international symposium on Computer architecture
Two-level adaptive training branch prediction
MICRO 24 Proceedings of the 24th annual international symposium on Microarchitecture
Alpha architecture reference manual
Alpha architecture reference manual
Predicting conditional branch directions from previous runs of a program
ASPLOS V Proceedings of the fifth international conference on Architectural support for programming languages and operating systems
A comparison of dynamic branch predictors that use two levels of branch history
ISCA '93 Proceedings of the 20th annual international symposium on computer architecture
ATOM: a system for building customized program analysis tools
PLDI '94 Proceedings of the ACM SIGPLAN 1994 conference on Programming language design and implementation
Guarded execution and branch prediction in dynamic ILP processors
ISCA '94 Proceedings of the 21st annual international symposium on Computer architecture
Communications of the ACM - Special issue on computer architecture
Architecture of the Pentium Microprocessor
IEEE Micro
A study of branch prediction strategies
ISCA '81 Proceedings of the 8th annual symposium on Computer Architecture
PACT '95 Proceedings of the IFIP WG10.3 working conference on Parallel architectures and compilation techniques
Integrating a misprediction recovery cache (MRC) into a superscalar pipeline
Proceedings of the 29th annual ACM/IEEE international symposium on Microarchitecture
Analysis techniques for predicated code
Proceedings of the 29th annual ACM/IEEE international symposium on Microarchitecture
A framework for balancing control flow and predication
MICRO 30 Proceedings of the 30th annual ACM/IEEE international symposium on Microarchitecture
Integrated predicated and speculative execution in the IMPACT EPIC architecture
Proceedings of the 25th annual international symposium on Computer architecture
The Partial Reverse If-Conversion Framework for Balancing Control Flow and Predication
International Journal of Parallel Programming
Using profiling to reduce branch misprediction costs on a dynamically scheduled processor
Proceedings of the 14th international conference on Supercomputing
Offline program re-mapping to improve branch prediction efficiency in embedded systems
ASP-DAC '00 Proceedings of the 2000 Asia and South Pacific Design Automation Conference
Proceedings of the 34th annual ACM/IEEE international symposium on Microarchitecture
Path Analysis and Renaming for Predicated Instruction Scheduling
International Journal of Parallel Programming
The Misprediction Recovery Cache
International Journal of Parallel Programming
Incorporating Predicate Information into Branch Predictors
HPCA '03 Proceedings of the 9th International Symposium on High-Performance Computer Architecture
Selective Guarded Execution Using Profiling on a Dynamically Scheduled Processor
IWIA '99 Proceedings of the 1999 International Workshop on Innovative Architecture
Wish Branches: Combining Conditional Branching and Predication for Adaptive Predicated Execution
Proceedings of the 38th annual IEEE/ACM International Symposium on Microarchitecture
2D-Profiling: Detecting Input-Dependent Branches with a Single Input Data Set
Proceedings of the International Symposium on Code Generation and Optimization
Proceedings of the 39th Annual IEEE/ACM International Symposium on Microarchitecture
Profile-assisted Compiler Support for Dynamic Predication in Diverge-Merge Processors
Proceedings of the International Symposium on Code Generation and Optimization
RIMP: runtime implicit predication
APPT'05 Proceedings of the 6th international conference on Advanced Parallel Processing Technologies
Hi-index | 0.00 |
High performance architectures have always had to deal with the performance limiting impact of branch operations. Microprocessor designs are going to have to deal with this problem as well, as they move towards deeper pipelines and support for multiple instruction issue. Branch prediction schemes are often used to alleviate the negative impact of branch operations by allowing the speculative execution of instructions after an unresolved branch. Another technique is to eliminate branch instructions altogether. Predication can remove forward branch instructions by translating the instructions following the branch into predicate form.This paper analyzes a variety of existing predication models for eliminating branch operations, and the effect that this elimination has on the branch prediction schemes in existing processors, including single issue architectures with simple prediction mechanisms, to the newer multi-issue designs with correspondingly more sophisticated branch predictors. The effect on branch prediction accuracy, branch penalty and basic block size is studied.