Predicated execution can eliminate hard-to-predict branches and help expose instruction-level parallelism. In many existing predication variants, the result update is conditional on the outcome of the guarding predicate. However, conditionally writing a register creates a naming problem for an out-of-order processor and can stall instruction issue. The problem arises because multiple predicated definitions may reach a use, and which of them actually supplies the value remains unresolved until the earlier predicate values are computed.

In this paper we focus on a lightweight form of predication, Phi-Predication, in which every predicated instruction writes a result value to its register regardless of the predicate value (i.e., even if it is false). The predicate therefore does not guard the write of the result register; instead, it acts as a form of selection between two input registers. This eliminates the naming problem for an out-of-order processor. Our Phi-Predicated ISA is derived from the predicated features of the Multiflow ISA, with extensions to efficiently predicate complex control flow. Our compiler modifications also expand upon prior techniques to provide efficient code generation. We examine the use of Phi-Predication for in-order and out-of-order architectures and compare its performance with that of select-op and IA64 ISA predication.
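
The contrast between a guarded write and an always-written result can be illustrated with a small register-file model. This is only a sketch under stated assumptions: the abstract does not give the instruction encodings or exact semantics, so the function names (`guarded_add`, `always_write_add`), the `fallback` operand, and the dictionary-based register file are all hypothetical and do not represent the paper's ISA.

```python
def guarded_add(regs, pred, dst, src1, src2):
    """Conventional predication: dst is written only when pred is true."""
    if pred:
        regs[dst] = regs[src1] + regs[src2]
    # When pred is false, dst keeps its old value, so a later reader cannot
    # tell which definition it depends on until pred has been computed.


def always_write_add(regs, pred, dst, src1, src2, fallback):
    """Always-write predication (assumed semantics): dst is written either way.

    The predicate only selects which value lands in dst: the computed sum
    when pred is true, or the value of the `fallback` register otherwise.
    """
    total = regs[src1] + regs[src2]
    regs[dst] = total if pred else regs[fallback]


def run(pred):
    """Execute both variants on identical register state and return r3 from each."""
    regs_a = {"r1": 10, "r2": 5, "r3": 99}
    regs_b = dict(regs_a)
    guarded_add(regs_a, pred, "r3", "r1", "r2")
    # Passing r3 as the fallback reproduces "keep the old value", but as an
    # explicit data input rather than a suppressed write.
    always_write_add(regs_b, pred, "r3", "r1", "r2", fallback="r3")
    return regs_a["r3"], regs_b["r3"]
```

Both variants produce the same architectural value in this model. The difference is that the second always defines `dst`, so a renaming stage could allocate a new destination register unconditionally and treat the old value as an ordinary source operand.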