Overlapped loop support in the Cydra 5
ASPLOS III Proceedings of the third international conference on Architectural support for programming languages and operating systems
Executing a Program on the MIT Tagged-Token Dataflow Architecture
IEEE Transactions on Computers
Journal of Parallel and Distributed Computing - Special issue on shared-memory multiprocessors
Efficiently computing static single assignment form and the control dependence graph
ACM Transactions on Programming Languages and Systems (TOPLAS)
Sentinel scheduling for VLIW and superscalar processors
ASPLOS V Proceedings of the fifth international conference on Architectural support for programming languages and operating systems
Effective compiler support for predicated execution using the hyperblock
MICRO 25 Proceedings of the 25th annual international symposium on Microarchitecture
The multiflow trace scheduling compiler
The Journal of Supercomputing - Special issue on instruction-level parallelism
Guarded execution and branch prediction in dynamic ILP processors
ISCA '94 Proceedings of the 21st annual international symposium on Computer architecture
A framework for balancing control flow and predication
MICRO 30 Proceedings of the 30th annual ACM/IEEE international symposium on Microarchitecture
The program decision logic approach to predicated execution
ISCA '99 Proceedings of the 26th annual international symposium on Computer architecture
Communications of the ACM - Special issue on computer architecture
Conversion of control dependence to data dependence
POPL '83 Proceedings of the 10th ACM SIGACT-SIGPLAN symposium on Principles of programming languages
The Alpha 21264 Microprocessor
IEEE Micro
A preliminary architecture for a basic data-flow processor
ISCA '75 Proceedings of the 2nd annual symposium on Computer architecture
Predicate prediction for efficient out-of-order execution
ICS '03 Proceedings of the 17th annual international conference on Supercomputing
Register Renaming and Scheduling for Dynamic Execution of Predicated Code
HPCA '01 Proceedings of the 7th International Symposium on High-Performance Computer Architecture
A COMPILER FOR THE MIT TAGGED-TOKEN DATAFLOW ARCHITECTURE
A COMPILER FOR THE MIT TAGGED-TOKEN DATAFLOW ARCHITECTURE
Systematic compilation for predicated execution
Systematic compilation for predicated execution
Proceedings of the 36th annual IEEE/ACM International Symposium on Microarchitecture
ASPLOS XI Proceedings of the 11th international conference on Architectural support for programming languages and operating systems
Wish Branches: Combining Conditional Branching and Predication for Adaptive Predicated Execution
Proceedings of the 38th annual IEEE/ACM International Symposium on Microarchitecture
Compiling for EDGE Architectures
Proceedings of the International Symposium on Code Generation and Optimization
A spatial path scheduling algorithm for EDGE architectures
Proceedings of the 12th international conference on Architectural support for programming languages and operating systems
Merging Head and Tail Duplication for Convergent Hyperblock Formation
Proceedings of the 39th Annual IEEE/ACM International Symposium on Microarchitecture
Merging Head and Tail Duplication for Convergent Hyperblock Formation
Proceedings of the 39th Annual IEEE/ACM International Symposium on Microarchitecture
Compiler Controlled Speculation for Power Aware ILP Extraction in Dataflow Architectures
HiPEAC '09 Proceedings of the 4th International Conference on High Performance Embedded Architectures and Compilers
An evaluation of the TRIPS computer system
Proceedings of the 14th international conference on Architectural support for programming languages and operating systems
The potential of using dynamic information flow analysis in data value prediction
Proceedings of the 19th international conference on Parallel architectures and compilation techniques
Software-based branch predication for AMD GPUs
ACM SIGARCH Computer Architecture News
Leveraging Strength-Based Dynamic Information Flow Analysis to Enhance Data Value Prediction
ACM Transactions on Architecture and Code Optimization (TACO)
Distributed replay protocol for distributed uniprocessors
Proceedings of the 26th ACM international conference on Supercomputing
Hi-index | 0.00 |
Predication facilitates high-bandwidth fetch and large static scheduling regions, but has typically been too complex to implement comprehensively in out-of-ordermicroarchitectures. This paper describes dataflow predication, which provides per-instruction predication in a dataflow ISA, low predication computation overheads similar to VLIW ISAs, and low complexity out-of-order issue. A twobit field in each instruction specifies whether an instruction is predicated, in which case, an arriving predicate token determines whether an instruction should execute. Dataflow predication incorporates three features that reduce predication overheads. First, dataflow predicate computation permits computation of compound predicates with virtually no overhead instructions. Second, early mispredication termination squashes in-flight instructions with false predicates at any time, eliminating the overhead of falsely predicated paths. Finally, implicit predication mitigates the fanout overhead of dataflow predicates by reducing the number of explicitly predicated instructions, by predicating only the heads of dependence chains. Dataflow predication also exposes new compiler optimizations-such as disjoint instruction merging and path-sensitive predicate removal-for increased performance of predicated code in an out-of-order design.