IEEE Transactions on Computers
A decade of reconfigurable computing: a visionary retrospective
Proceedings of the conference on Design, automation and test in Europe
Synthesis and Optimization of Digital Circuits
Synthesis and Optimization of Digital Circuits
Memory Issues in Embedded Systems-on-Chip: Optimizations and Exploration
Memory Issues in Embedded Systems-on-Chip: Optimizations and Exploration
Compilation Approach for Coarse-Grained Reconfigurable Architectures
IEEE Design & Test
PACT XPP—A Self-Reconfigurable Data Processing Architecture
The Journal of Supercomputing
ISVLSI '03 Proceedings of the IEEE Computer Society Annual Symposium on VLSI (ISVLSI'03)
Network Topology Exploration of Mesh-Based Coarse-Grain Reconfigurable Architectures
Proceedings of the conference on Design, automation and test in Europe - Volume 1
Design Methodology for a Tightly Coupled VLIW/Reconfigurable Matrix Architecture: A Case Study
Proceedings of the conference on Design, automation and test in Europe - Volume 2
ACM Transactions on Embedded Computing Systems (TECS)
Event Semantics in Two-person Interactions
ICPR '04 Proceedings of the Pattern Recognition, 17th International Conference on (ICPR'04) Volume 4 - Volume 04
DATE '03 Proceedings of the conference on Design, Automation and Test in Europe - Volume 1
Scalable Processor Instruction Set Extension
IEEE Design & Test
IEEE Transactions on Very Large Scale Integration (VLSI) Systems
Hi-index | 0.00 |
In this paper, we present performance results from mapping five real-world DSP applications on an embedded system-on-chip that incorporates coarse-grain reconfigurable logic with an instruction-set processor. The reconfigurable logic is realized by a 2-Dimensional Array of Processing Elements. A mapping flow for improving application's performance by accelerating critical software parts, called kernels, on the Coarse-Grain Reconfigurable Array is proposed. Profiling is performed for detecting critical kernel code. For mapping the detected kernels on the reconfigurable logic a priority-based mapping algorithm has been developed. The experiments for three different instances of a generic system show that the speedup from executing kernels on the Reconfigurable Array ranges from 9.9 to 151.1, with an average value of 54.1, relative to the kernels' execution on the processor. Important overall application speedups, due to the kernels' acceleration, have been reported for the five applications. These overall performance improvements range from 1.3 to 3.7, with an average value of 2.3, relative to an all-software execution.