Parallelization of loops with exits on pipelined architectures
Proceedings of the 1990 ACM/IEEE conference on Supercomputing
Avoiding unconditional jumps by code replication
PLDI '92 Proceedings of the ACM SIGPLAN 1992 conference on Programming language design and implementation
Eliminating branches using a superoptimizer and the GNU C compiler
PLDI '92 Proceedings of the ACM SIGPLAN 1992 conference on Programming language design and implementation
Predicting conditional branch directions from previous runs of a program
ASPLOS V Proceedings of the fifth international conference on Architectural support for programming languages and operating systems
Effective compiler support for predicated execution using the hyperblock
MICRO 25 Proceedings of the 25th annual international symposium on Microarchitecture
An efficient resource-constrained global scheduling technique for superscalar and VLIW processors
MICRO 25 Proceedings of the 25th annual international symposium on Microarchitecture
The multiflow trace scheduling compiler
The Journal of Supercomputing - Special issue on instruction-level parallelism
The Journal of Supercomputing - Special issue on instruction-level parallelism
The superblock: an effective technique for VLIW and superscalar compilation
The Journal of Supercomputing - Special issue on instruction-level parallelism
Avoiding conditional branches by code replication
PLDI '95 Proceedings of the ACM SIGPLAN 1995 conference on Programming language design and implementation
Critical path reduction for scalar programs
Proceedings of the 28th annual international symposium on Microarchitecture
Analysis techniques for predicated code
Proceedings of the 29th annual ACM/IEEE international symposium on Microarchitecture
Interprocedural conditional branch elimination
Proceedings of the ACM SIGPLAN 1997 conference on Programming language design and implementation
Improving performance by branch reordering
PLDI '98 Proceedings of the ACM SIGPLAN 1998 conference on Programming language design and implementation
Conversion of control dependence to data dependence
POPL '83 Proceedings of the 10th ACM SIGACT-SIGPLAN symposium on Principles of programming languages
Structure of Computers and Computations
Structure of Computers and Computations
Acceleration of First and Higher Order Recurrences on Processors with Instruction Level Parallelism
Proceedings of the 6th International Workshop on Languages and Compilers for Parallel Computing
Treegion Scheduling for Wide Issue Processors
HPCA '98 Proceedings of the 4th International Symposium on High-Performance Computer Architecture
An integrated approach to accelerate data and predicate computations in hyperblocks
Proceedings of the 33rd annual ACM/IEEE international symposium on Microarchitecture
Accurate and efficient predicate analysis with binary decision diagrams
Proceedings of the 33rd annual ACM/IEEE international symposium on Microarchitecture
Focusing processor policies via critical-path prediction
ISCA '01 Proceedings of the 28th annual international symposium on Computer architecture
Efficient and effective branch reordering using profile data
ACM Transactions on Programming Languages and Systems (TOPLAS)
Path Analysis and Renaming for Predicated Instruction Scheduling
International Journal of Parallel Programming
A Trace Based Evaluation of Speculative Branch Decoupling
ICCD '00 Proceedings of the 2000 IEEE International Conference on Computer Design: VLSI in Computers & Processors
Static Placement, Dynamic Issue (SPDI) Scheduling for EDGE Architectures
Proceedings of the 13th International Conference on Parallel Architectures and Compilation Techniques
Branch elimination by condition merging
Software—Practice & Experience
Hi-index | 0.00 |
The challenge of exploiting high degrees of instruction-level parallelism is often hampered by frequent branching. Both exposed branch latency and low branch throughput can restrict parallelism. Control critical path reduction (control CPR) is a compilation technique to address these problems. Control CPR can reduce the dependence height of critical paths through branch operations as well as decrease the number of executed branches. In this paper, we present an approach to control CPR that recognizes sequences of branches using profiling statistics. The control CPR transformation is applied to the predominant path through this sequence. Our approach, its implementation, and experimental results are presented. This work demonstrates that control CPR enhances instruction-level parallelism for a variety of application programs and improves their performance across a range of processors.