Highly concurrent scalar processing
ISCA '86 Proceedings of the 13th annual international symposium on Computer architecture
Overlapped loop support in the Cydra 5
ASPLOS III Proceedings of the third international conference on Architectural support for programming languages and operating systems
Limits on multiple instruction issue
ASPLOS III Proceedings of the third international conference on Architectural support for programming languages and operating systems
Comparing software and hardware schemes for reducing the cost of branches
ISCA '89 Proceedings of the 16th annual international symposium on Computer architecture
Limits of instruction-level parallelism
ASPLOS IV Proceedings of the fourth international conference on Architectural support for programming languages and operating systems
Single instruction stream parallelism is greater than two
ISCA '91 Proceedings of the 18th annual international symposium on Computer architecture
Two-level adaptive training branch prediction
MICRO 24 Proceedings of the 24th annual international symposium on Microarchitecture
Predicting conditional branch directions from previous runs of a program
ASPLOS V Proceedings of the fifth international conference on Architectural support for programming languages and operating systems
Effective compiler support for predicated execution using the hyperblock
MICRO 25 Proceedings of the 25th annual international symposium on Microarchitecture
PLDI '93 Proceedings of the ACM SIGPLAN 1993 conference on Programming language design and implementation
The superblock: an effective technique for VLIW and superscalar compilation
The Journal of Supercomputing - Special issue on instruction-level parallelism
Guarded execution and branch prediction in dynamic ILP processors
ISCA '94 Proceedings of the 21st annual international symposium on Computer architecture
Conversion of control dependence to data dependence
POPL '83 Proceedings of the 10th ACM SIGACT-SIGPLAN symposium on Principles of programming languages
A study of branch prediction strategies
ISCA '81 Proceedings of the 8th annual symposium on Computer Architecture
A comparative analysis of schemes for correlated branch prediction
ISCA '95 Proceedings of the 22nd annual international symposium on Computer architecture
PACT '95 Proceedings of the IFIP WG10.3 working conference on Parallel architectures and compilation techniques
Control flow prediction with tree-like subgraphs for superscalar processors
Proceedings of the 28th annual international symposium on Microarchitecture
A comparison of full and partial predicated execution support for ILP processors
ISCA '95 Proceedings of the 22nd annual international symposium on Computer architecture
Integrating a misprediction recovery cache (MRC) into a superscalar pipeline
Proceedings of the 29th annual ACM/IEEE international symposium on Microarchitecture
Analysis techniques for predicated code
Proceedings of the 29th annual ACM/IEEE international symposium on Microarchitecture
Increasing the instruction fetch rate via block-structured instruction set architectures
Proceedings of the 29th annual ACM/IEEE international symposium on Microarchitecture
A framework for balancing control flow and predication
MICRO 30 Proceedings of the 30th annual ACM/IEEE international symposium on Microarchitecture
Accurate indirect branch prediction
Proceedings of the 25th annual international symposium on Computer architecture
Integrated predicated and speculative execution in the IMPACT EPIC architecture
Proceedings of the 25th annual international symposium on Computer architecture
Predicting indirect branches via data compression
MICRO 31 Proceedings of the 31st annual ACM/IEEE international symposium on Microarchitecture
Control Flow Prediction Schemes for Wide-Issue Superscalar Processors
IEEE Transactions on Parallel and Distributed Systems
The Partial Reverse If-Conversion Framework for Balancing Control Flow and Predication
International Journal of Parallel Programming
Using profiling to reduce branch misprediction costs on a dynamically scheduled processor
Proceedings of the 14th international conference on Supercomputing
Proceedings of the 34th annual ACM/IEEE international symposium on Microarchitecture
Path Analysis and Renaming for Predicated Instruction Scheduling
International Journal of Parallel Programming
The Misprediction Recovery Cache
International Journal of Parallel Programming
Increasing the Instruction Fetch Rate via Block-Structured Instruction Set Architectures
International Journal of Parallel Programming
An Exploration of Instruction Fetch Requirement in Out-of-Order Superscalar Processors
International Journal of Parallel Programming
The IA-64 Architecture at Work
Computer
IEEE Micro
The Intel IA-64 Compiler Code Generator
IEEE Micro
Multiscalar Execution along a Single Flow of Control
ICPP '97 Proceedings of the international Conference on Parallel Processing
Hybrid Predication Model for Instruction Level Parallelism
IPDPS '02 Proceedings of the 16th International Parallel and Distributed Processing Symposium
Incorporating Predicate Information into Branch Predictors
HPCA '03 Proceedings of the 9th International Symposium on High-Performance Computer Architecture
Selective Guarded Execution Using Profiling on a Dynamically Scheduled Processor
IWIA '99 Proceedings of the 1999 International Workshop on Innovative Architecture
IPPS '98 Proceedings of the 12th. International Parallel Processing Symposium on International Parallel Processing Symposium
Custom Wide Counterflow Pipelines for High-Performance Embedded Applications
IEEE Transactions on Computers
The instruction register file micro-architecture
Future Generation Computer Systems - Special issue: Parallel computing technologies
Wish Branches: Combining Conditional Branching and Predication for Adaptive Predicated Execution
Proceedings of the 38th annual IEEE/ACM International Symposium on Microarchitecture
2D-Profiling: Detecting Input-Dependent Branches with a Single Input Data Set
Proceedings of the International Symposium on Code Generation and Optimization
Proceedings of the 39th Annual IEEE/ACM International Symposium on Microarchitecture
The instruction register file micro-architecture
Future Generation Computer Systems - Special issue: Parallel computing technologies
RIMP: runtime implicit predication
APPT'05 Proceedings of the 6th international conference on Advanced Parallel Processing Technologies
Hi-index | 0.00 |
Branch instructions are recognized as a major impediment to exploiting instruction level parallelism. Even with sophisticated branch prediction techniques, many frequently executed branches remain difficult to predict. An architecture supporting predicated execution may allow the compiler to remove many of these hard-to-predict branches, reducing the number of branch mispredictions and thereby improving performance. We present an in-depth analysis of the characteristics of those branches which are frequently mispredicted and examine the effectiveness of an advanced compiler to eliminate these branches. Over the benchmarks studied, an average of 27% of the dynamic branches and 56% of the dynamic branch mispredictions are eliminated with predicated execution support.