Efficiently computing static single assignment form and the control dependence graph
ACM Transactions on Programming Languages and Systems (TOPLAS)
Effective compiler support for predicated execution using the hyperblock
MICRO 25 Proceedings of the 25th annual international symposium on Microarchitecture
Analysis techniques for predicated code
Proceedings of the 29th annual ACM/IEEE international symposium on Microarchitecture
Global predicate analysis and its application to register allocation
Proceedings of the 29th annual ACM/IEEE international symposium on Microarchitecture
Computer architecture (2nd ed.): a quantitative approach
Computer architecture (2nd ed.): a quantitative approach
The program decision logic approach to predicated execution
ISCA '99 Proceedings of the 26th annual international symposium on Computer architecture
Accurate and efficient predicate analysis with binary decision diagrams
Proceedings of the 33rd annual ACM/IEEE international symposium on Microarchitecture
Conversion of control dependence to data dependence
POPL '83 Proceedings of the 10th ACM SIGACT-SIGPLAN symposium on Principles of programming languages
Introducing the IA-64 Architecture
IEEE Micro
Itanium Processor Microarchitecture
IEEE Micro
The Intel IA-64 Compiler Code Generator
IEEE Micro
Predicated Static Single Assignment
PACT '99 Proceedings of the 1999 International Conference on Parallel Architectures and Compilation Techniques
Register Renaming and Scheduling for Dynamic Execution of Predicated Code
HPCA '01 Proceedings of the 7th International Symposium on High-Performance Computer Architecture
Dynamic Data Dependence Tracking and its Application to Branch Prediction
HPCA '03 Proceedings of the 9th International Symposium on High-Performance Computer Architecture
Hi-index | 0.00 |
Predicated Execution has been put forth as a method for improving processor performance by removing hard-to-predict branches. As part of the process of turning a set of basic blocks into a predicated region, both paths of a branch are combined into a single path. There can be multiple definitions from disjoint paths that reach a use. Waiting to find out the correct definition that actually reaches the use can cause pipeline stalls.In this paper we examine a hardware optimization that dynamically collects and analyzes path information to determine valid dependences for predicated regions of code. We then use this information for an in-order VLIW predicated processor, so that instructions can continue towards execution without having to wait on operands from false dependences. Our results show that using our Disjoint Path Analysis System provides speedups over 6% and elimination of false RAW dependences of up to 14% due to the detection of erroneous dependences in if-converted regions of code.