This paper introduces an embedded system that extends microprocessor cores with a high-performance coarse-grained reconfigurable data-path. The data-path, previously introduced by the authors, is composed of computational resources that realize complex operations, which improves the performance of time-critical application parts, called kernels. A compilation flow is defined for mapping high-level software descriptions onto the system. The kernel code is mapped onto the reconfigurable data-path by a mapping algorithm developed for it, while the non-critical segments execute on the microprocessor. Four real-life applications are mapped onto six different instances of the system. The results show that executing kernels on the reconfigurable logic yields speedups of 6.3 to 154.3 relative to software execution on the processor, because the available processing elements of the data-path are efficiently utilized. Owing to this kernel acceleration, overall application speedups range from 1.70 to 3.70 relative to all-processor execution for the four applications. Furthermore, the experiments show that the proposed data-path executes kernels faster than other high-performance data-paths.
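
The gap between the kernel speedups and the overall application speedups can be illustrated with Amdahl's law. The sketch below is not taken from the paper. It assumes that the only effect of the reconfigurable data-path is to shorten the kernel portion of the run time, and it ignores any communication or reconfiguration overhead. The function name and the kernel fractions in the loop are hypothetical values chosen for illustration, not measured data.

```python
def overall_speedup(kernel_fraction: float, kernel_speedup: float) -> float:
    """Amdahl's law: overall speedup when only the kernels are accelerated.

    kernel_fraction: share of the all-processor run time spent in kernels (0..1).
    kernel_speedup:  speedup of the kernels on the reconfigurable data-path.
    Both parameters are illustrative assumptions, not values from the paper.
    """
    if not 0.0 <= kernel_fraction <= 1.0:
        raise ValueError("kernel_fraction must lie in [0, 1]")
    if kernel_speedup <= 0.0:
        raise ValueError("kernel_speedup must be positive")
    # The non-kernel time is unchanged; the kernel time shrinks by kernel_speedup.
    return 1.0 / ((1.0 - kernel_fraction) + kernel_fraction / kernel_speedup)


if __name__ == "__main__":
    # Hypothetical kernel fractions, combined with the two extreme kernel
    # speedups quoted in the abstract (6.3 and 154.3).
    for fraction in (0.4, 0.6, 0.8):
        for speedup in (6.3, 154.3):
            result = overall_speedup(fraction, speedup)
            print(f"fraction={fraction:.1f} kernel speedup={speedup:6.1f} "
                  f"overall={result:.2f}")
```

Under this model the overall speedup can never exceed 1 / (1 - kernel_fraction), however fast the kernels run. That is consistent with overall gains remaining in the low single digits while the kernel gains reach two or three orders of magnitude.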