Diverge-Merge Processor (DMP): Dynamic Predicated Execution of Complex Control-Flow Graphs Based on Frequently Executed Paths

Authors:
Hyesoon Kim; Jose A. Joao; Onur Mutlu; Yale N. Patt
Affiliations:
University of Texas at Austin; University of Texas at Austin; Microsoft Research; University of Texas at Austin
Venue:
Proceedings of the 39th Annual IEEE/ACM International Symposium on Microarchitecture
Year:
2006


Abstract

This paper proposes a new processor architecture for handling hard-to-predict branches, the diverge-merge processor (DMP). The goal of this paradigm is to eliminate branch mispredictions due to hard-to-predict dynamic branches by dynamically predicating them without requiring ISA support for predicate registers and predicated instructions. To achieve this without incurring large hardware cost and complexity, the compiler provides control-flow information by hints and the processor dynamically predicates instructions only on frequently executed program paths. The key insight behind DMP is that most control-flow graphs look and behave like simple hammock (if-else) structures when only frequently executed paths in the graphs are considered. Therefore, DMP can dynamically predicate a much larger set of branches than simple hammock branches.
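
The key insight can be illustrated with a small sketch. It is not the mechanism described in the paper: it only shows, under stated assumptions, how a control-flow graph that has no single merge point can look like a simple hammock once rarely executed edges are ignored. The graph, the block names, the edge probabilities, the 0.05 threshold, and the functions `prune`, `post_dominators`, and `merge_point` are all hypothetical, and the code assumes an acyclic graph.

```python
# Hypothetical profile: block -> {successor: probability that the edge is taken}.
CFG = {
    "A": {"B": 0.5, "C": 0.5},         # hard-to-predict branch
    "B": {"D": 0.98, "ERR": 0.02},     # ERR is a rare early exit
    "C": {"D": 0.97, "SLOW": 0.03},    # SLOW is a rare detour around D
    "SLOW": {"E": 1.0},
    "D": {"E": 1.0},
    "ERR": {},
    "E": {},
}


def prune(cfg, min_prob):
    """Keep only the edges whose profiled probability is at least min_prob."""
    return {
        node: [succ for succ, prob in edges.items() if prob >= min_prob]
        for node, edges in cfg.items()
    }


def post_dominators(succs):
    """Map each block to the blocks that lie on every path from it to an exit.

    Assumes the graph is acyclic, so plain memoized recursion terminates.
    """
    memo = {}

    def visit(node):
        if node not in memo:
            out = succs.get(node, [])
            # Blocks common to all outgoing paths; empty for an exit block.
            common = set.intersection(*(visit(s) for s in out)) if out else set()
            memo[node] = {node} | common
        return memo[node]

    for node in succs:
        visit(node)
    return memo


def merge_point(cfg, branch, min_prob=0.0):
    """Nearest block where all sufficiently frequent paths from `branch` rejoin.

    Returns None when those paths share no block after the branch.
    """
    pdom = post_dominators(prune(cfg, min_prob))
    candidates = pdom[branch] - {branch}
    if not candidates:
        return None
    # Post-dominators of a block form a chain; the nearest one has the most
    # post-dominators of its own.
    return max(candidates, key=lambda node: len(pdom[node]))


if __name__ == "__main__":
    print(merge_point(CFG, "A"))                 # all edges considered
    print(merge_point(CFG, "A", min_prob=0.05))  # frequent edges only
```

With every edge considered, the rare early exit leaves the branch at `A` with no merge point, so it is not a simple hammock. With edges below the assumed threshold ignored, both sides of the branch rejoin at `D`, which is the if-else shape the abstract refers to.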