Efficiently computing static single assignment form and the control dependence graph
ACM Transactions on Programming Languages and Systems (TOPLAS)
Limits of control flow on parallelism
ISCA '92 Proceedings of the 19th annual international symposium on Computer architecture
Effective compiler support for predicated execution using the hyperblock
MICRO 25 Proceedings of the 25th annual international symposium on Microarchitecture
PLDI '93 Proceedings of the ACM SIGPLAN 1993 conference on Programming language design and implementation
Guarded execution and branch prediction in dynamic ILP processors
ISCA '94 Proceedings of the 21st annual international symposium on Computer architecture
The effects of predicated execution on branch prediction
MICRO 27 Proceedings of the 27th annual international symposium on Microarchitecture
Characterizing the impact of predicated execution on branch prediction
MICRO 27 Proceedings of the 27th annual international symposium on Microarchitecture
PACT '95 Proceedings of the IFIP WG10.3 working conference on Parallel architectures and compilation techniques
Assigning confidence to conditional branch predictions
Proceedings of the 29th annual ACM/IEEE international symposium on Microarchitecture
A framework for balancing control flow and predication
MICRO 30 Proceedings of the 30th annual ACM/IEEE international symposium on Microarchitecture
Multipath execution: opportunities and limits
ICS '98 Proceedings of the 12th international conference on Supercomputing
Confidence estimation for speculation control
Proceedings of the 25th annual international symposium on Computer architecture
Selective eager execution on the PolyPath architecture
Proceedings of the 25th annual international symposium on Computer architecture
Reducing branch misprediction penalties via dynamic control independence detection
ICS '99 Proceedings of the 13th international conference on Supercomputing
Control independence in trace processors
Proceedings of the 32nd annual ACM/IEEE international symposium on Microarchitecture
Instruction fetch mechanisms for multipath execution processors
Proceedings of the 32nd annual ACM/IEEE international symposium on Microarchitecture
IEEE Transactions on Computers
Wattch: a framework for architectural-level power analysis and optimizations
Proceedings of the 27th annual international symposium on Computer architecture
Increasing processor performance by implementing deeper pipelines
ISCA '02 Proceedings of the 29th annual international symposium on Computer architecture
Skipper: a microarchitecture for exploiting control-flow independence
Proceedings of the 34th annual ACM/IEEE international symposium on Microarchitecture
Proceedings of the 34th annual ACM/IEEE international symposium on Microarchitecture
Conversion of control dependence to data dependence
POPL '83 Proceedings of the 10th ACM SIGACT-SIGPLAN symposium on Principles of programming languages
HPCA '97 Proceedings of the 3rd IEEE Symposium on High-Performance Computer Architecture
A Study of Control Independence in Superscalar Processors
HPCA '99 Proceedings of the 5th International Symposium on High Performance Computer Architecture
Runahead Execution: An Alternative to Very Large Instruction Windows for Out-of-Order Processors
HPCA '03 Proceedings of the 9th International Symposium on High-Performance Computer Architecture
Incorporating Predicate Information into Branch Predictors
HPCA '03 Proceedings of the 9th International Symposium on High-Performance Computer Architecture
Dynamic Hammock Predication for Non-Predicated Instruction Set Architectures
PACT '98 Proceedings of the 1998 International Conference on Parallel Architectures and Compilation Techniques
Register Renaming and Scheduling for Dynamic Execution of Predicated Code
HPCA '01 Proceedings of the 7th International Symposium on High-Performance Computer Architecture
Dynamic Branch Prediction with Perceptrons
HPCA '01 Proceedings of the 7th International Symposium on High-Performance Computer Architecture
Microarchitecture Optimizations for Exploiting Memory-Level Parallelism
Proceedings of the 31st annual international symposium on Computer architecture
Field-testing IMPACT EPIC research results in Itanium 2
Proceedings of the 31st annual international symposium on Computer architecture
ASPLOS XI Proceedings of the 11th international conference on Architectural support for programming languages and operating systems
Control Flow Optimization Via Dynamic Reconvergence Prediction
Proceedings of the 37th annual IEEE/ACM International Symposium on Microarchitecture
Analysis of the O-GEometric History Length Branch Predictor
Proceedings of the 32nd annual international symposium on Computer Architecture
Reducing Branch Misprediction Penalty via Selective Branch Recovery
HPCA '04 Proceedings of the 10th International Symposium on High Performance Computer Architecture
High-Performance Throughput Computing
IEEE Micro
Wish Branches: Combining Conditional Branching and Predication for Adaptive Predicated Execution
Proceedings of the 38th annual IEEE/ACM International Symposium on Microarchitecture
MinneSPEC: A New SPEC Benchmark Workload for Simulation-Based Computer Architecture Research
IEEE Computer Architecture Letters
Ginger: control independence using tag rewriting
Proceedings of the 34th annual international symposium on Computer architecture
Profile-assisted Compiler Support for Dynamic Predication in Diverge-Merge Processors
Proceedings of the International Symposium on Code Generation and Optimization
Improving the performance of object-oriented languages with dynamic predication of indirect jumps
Proceedings of the 13th international conference on Architectural support for programming languages and operating systems
Mixed speculative multithreaded execution models
ACM Transactions on Architecture and Code Optimization (TACO)
MICRO-45 Proceedings of the 2012 45th Annual IEEE/ACM International Symposium on Microarchitecture
Hi-index | 0.00 |
This paper proposes a new processor architecture for handling hard-to-predict branches, the diverge-merge processor (DMP). The goal of this paradigm is to eliminate branch mispredictions due to hard-to-predict dynamic branches by dynamically predicating them without requiring ISA support for predicate registers and predicated instructions. To achieve this without incurring large hardware cost and complexity, the compiler provides control-flow information by hints and the processor dynamically predicates instructions only on frequently executed program paths. The key insight behind DMP is that most control-flow graphs look and behave like simple hammock (if-else) structures when only frequently executed paths in the graphs are considered. Therefore, DMP can dynamically predicate a much larger set of branches than simple hammock branches. This paper proposes a new processor architecture for handling hard-to-predict branches, the diverge-merge processor (DMP). The goal of this paradigm is to eliminate branch mispredictions due to hard-to-predict dynamic branches by dynamically predicating them without requiring ISA support for predicate registers and predicated instructions. To achieve this without incurring large hardware cost and complexity, the compiler provides control-flow information by hints and the processor dynamically predicates instructions only on frequently executed program paths. The key insight behind DMP is that most control-flow graphs look and behave like simple hammock (if-else) structures when only frequently executed paths in the graphs are considered. Therefore, DMP can dynamically predicate a much larger set of branches than simple hammock branches.