The research community has studied if-conversion for many years. However, due to the lack of existing hardware, earlier studies were conducted by simulating code generated by experimental compilers. This paper presents the first comprehensive study of the use of predication to implement if-conversion on production hardware with a near-production compiler. To better understand trends in the measurements, we generated binaries at three increasing levels of if-conversion aggressiveness. For each level, we gathered data on the global runtime effects of if-conversion: overall execution time, register pressure, code size, and branch behavior. Furthermore, we studied the inherent characteristics of program control-flow structure related to branching to help determine the fundamental limits of if-conversion. Our results show that, on the Itanium™ processor, if-conversion could potentially remove 29% of the branch mispredictions in SPEC2000CINT, but that this accounts for a substantially smaller overall program speedup than previously reported.