Coarse-Grained Reconfigurable Architectures (CGRAs) are an attractive platform that promises both high performance and high power efficiency. One of the primary challenges in using CGRAs is developing compilers that can automatically and efficiently map applications onto the CGRA. To this end, this paper makes three contributions: (i) Using Re-computation for Resource Limitations: for the first time in CGRA compilers, we propose re-computation as a solution to the resource limitation problem. This extends the solution space and enables better mappings. (ii) General Problem Formulation: we present a precise and general formulation of the application mapping problem on a CGRA and establish its computational complexity. (iii) Extracting an Efficient Heuristic: using insights from the problem formulation, we design an effective global heuristic called EPIMap. EPIMap transforms the input specification (a directed graph) into an epimorphic equivalent graph that satisfies the necessary conditions for mapping onto a CGRA, reducing the search space. Experimental results on 14 important kernels extracted from well-known benchmark programs show that EPIMap improves kernel performance on the CGRA by more than 2.8X on average compared to EMS, one of the best existing mapping algorithms. EPIMap achieved the theoretical best performance for 9 of the 14 benchmarks, while EMS did not achieve it for any of them. EPIMap achieves these better mappings at an acceptable increase in compilation time.
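
The re-computation idea in contribution (i) can be illustrated with a small sketch. It is not the EPIMap algorithm; it only shows one way an operation could be duplicated so that no node in the data-flow graph drives more consumers than a processing element can reach. The function name `recompute_for_fanout`, the `max_out` limit, and the `#k` naming of copies are all assumptions made for this illustration, and the input is assumed to be an acyclic graph without parallel edges.

```python
from graphlib import TopologicalSorter  # Python 3.9+


def recompute_for_fanout(succ, max_out):
    """Duplicate nodes whose out-degree exceeds max_out (hypothetical limit).

    succ maps each node name to the list of its consumers in a DAG.
    Returns a new successor map in which every node has at most
    max_out consumers; extra consumers are served by copies that
    re-compute the same operation from the same inputs.
    """
    if max_out < 1:
        raise ValueError("max_out must be at least 1")

    nodes = set(succ) | {d for outs in succ.values() for d in outs}
    out = {n: list(succ.get(n, [])) for n in nodes}

    # Original producers of each node; copies inherit these inputs.
    pred = {n: [] for n in nodes}
    for n, outs in out.items():
        for d in outs:
            pred[d].append(n)

    # Passing the successor map makes the sorter emit consumers before
    # producers, so a node is visited only after its fan-out is final.
    order = list(TopologicalSorter(out).static_order())

    for v in order:
        outs = out[v]
        if len(outs) <= max_out:
            continue
        groups = [outs[i:i + max_out] for i in range(0, len(outs), max_out)]
        out[v] = groups[0]
        for k, group in enumerate(groups[1:], start=1):
            copy = f"{v}#{k}"          # re-computed instance of v
            out[copy] = group
            for p in pred[v]:          # each copy reads the same inputs
                out[p].append(copy)
    return out


# Node "a" feeds four consumers, but only two are allowed per node.
example = recompute_for_fanout({"x": ["a"], "a": ["b", "c", "d", "e"]}, max_out=2)
```

Each copy adds an edge from every producer of the original node, which is why nodes are visited consumers-first: a producer whose fan-out grows because of a split is itself checked later. The trade-off is extra operations in exchange for a graph that meets the assumed connectivity limit.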