Effective compiler support for predicated execution using the hyperblock
MICRO 25 Proceedings of the 25th annual international symposium on Microarchitecture
IEEE Transactions on Computers
A decade of reconfigurable computing: a visionary retrospective
Proceedings of the conference on Design, automation and test in Europe
DSD '02 Proceedings of the Euromicro Symposium on Digital Systems Design
LLVM: A Compilation Framework for Lifelong Program Analysis & Transformation
Proceedings of the international symposium on Code generation and optimization: feedback-directed and runtime optimization
Superword-Level Parallelism in the Presence of Control Flow
Proceedings of the international symposium on Code generation and optimization
Proceedings of the conference on Design, Automation and Test in Europe - Volume 1
High-level synthesis challenges and solutions for a dynamically reconfigurable processor
Proceedings of the 2006 IEEE/ACM international conference on Computer-aided design
Proceedings of the conference on Design, automation and test in Europe
A graph drawing based spatial mapping algorithm for coarse-grained reconfigurable architectures
IEEE Transactions on Very Large Scale Integration (VLSI) Systems
EPIMap: using epimorphism to map applications on CGRAs
Proceedings of the 49th Annual Design Automation Conference
State-based full predication for low power coarse-grained reconfigurable architecture
DATE '12 Proceedings of the Conference on Design, Automation and Test in Europe
Evaluator-executor transformation for efficient pipelining of loops with conditionals
ACM Transactions on Architecture and Code Optimization (TACO)
Hi-index | 0.00 |
Predication is an essential technique to accelerate kernels with control flow on CGRAs. While state-based full predication (SFP) can remove wasteful power consumption on issuing/decoding instructions from conventional full predication, generating code for SFP is challenging for general CGRAs, especially when there are multiple conditionals to be handled due to exploiting data level parallelism. In this paper, we present a novel compiler framework addressing central issues such as how to express the parallelism between multiple conditionals, and how to allocate resources to them to maximize the parallelism. In particular, by separating the handling of control flow and data flow, our framework can be integrated with conventional mapping algorithms for mapping data flow. Experimental results demonstrate that our framework can find and exploit parallelism between multiple conditionals, thereby leading to 2.21 times higher performance on average than a naive approach.