IEEE Transactions on Computers
A decade of reconfigurable computing: a visionary retrospective
Proceedings of the conference on Design, automation and test in Europe
KressArray Xplorer: a new CAD environment to optimize reconfigurable datapath array
ASP-DAC '00 Proceedings of the 2000 Asia and South Pacific Design Automation Conference
Compilation Approach for Coarse-Grained Reconfigurable Architectures
IEEE Design & Test
An algorithm for mapping loops onto coarse-grained reconfigurable architectures
Proceedings of the 2003 ACM SIGPLAN conference on Language, compiler, and tool for embedded systems
Reconfigurable Instruction Set Processors: A Survey
RSP '00 Proceedings of the 11th IEEE International Workshop on Rapid System Prototyping (RSP 2000)
Interconnect Architecture Exploration for Low-Energy Reconfigurable Single-Chip DSPs
WVLSI '99 Proceedings of the IEEE Computer Society Workshop on VLSI'99
ISVLSI '03 Proceedings of the IEEE Computer Society Annual Symposium on VLSI (ISVLSI'03)
Design Methodology for a Tightly Coupled VLIW/Reconfigurable Matrix Architecture: A Case Study
Proceedings of the conference on Design, automation and test in Europe - Volume 2
Proceedings of the conference on Design, Automation and Test in Europe - Volume 1
ISLPED '05 Proceedings of the 2005 international symposium on Low power electronics and design
The multi-dataflow composer tool: generation of on-the-fly reconfigurable platforms
Journal of Real-Time Image Processing
Hi-index | 0.00 |
Coarse-grained reconfigurable architectures (CGRAs) aim to achieve both goals of high performance and flexibility. In addition, power consumption is significant for the reconfigurable architecture to be used as a competitive processing core in embedded systems. However, the existing reconfigurable architectures require too much area and power. In this paper, we propose a new design space exploration flow, optimizing CGRA to reduce area and power with enhancing performance for digital signal processing (DSP) application domain. It reduces the array size through efficient arrangement of array components and customization of their interconnection, exploiting input patterns belonging to the DSP application domain. Such a design flow is based on pipelining and sharing of area/delay-critical resources in the processing element array. Experimental results show that for DSP applications, the proposed approach reduces area by up to 36.75%, average execution time by 36.78%, and average power by 31.85% when compared with the existing CGRA architecture.