Aggressive embedded processors are often equipped with general-purpose cores and special-purpose acceleration logic. In this paper, we consider a reconfigurable processor that consists of very long instruction word (VLIW) cores and coarse-grained reconfigurable arrays (CGRAs). CGRAs are used mainly to improve performance by exploiting loop-level parallelism, while VLIW cores rely on discovering instruction-level parallelism. CGRAs can accelerate time-consuming loops through powerful pipeline scheduling. However, not all loops can be accelerated this way: outer loops and loops containing function calls are not candidates for CGRA acceleration. We adopt instruction extensions to convert code fragments in outer loops and simple functions into single instructions. With these extended instructions available in the CGRAs, more loops become eligible for CGRA acceleration. Our experiment with mpeg2dec from MediaBench shows a 32% performance increase.
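The eligibility rule described above can be illustrated with a minimal sketch. This is a toy model and not the paper's implementation: the `Op` type, the operation kinds, and the function names `is_cgra_candidate` and `apply_extensions` are all hypothetical. It assumes a loop body is a flat list of operations, and that a loop qualifies for CGRA mapping only when no operation is a function call or a nested loop.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Op:
    # kind is one of: "alu", "call", "inner_loop", "ext" (extended instruction).
    # These kinds are assumptions made for this illustration only.
    kind: str
    name: str


def is_cgra_candidate(body):
    """A loop body qualifies only if every operation is a plain ALU
    operation or an extended instruction (no calls, no nested loops)."""
    return all(op.kind in ("alu", "ext") for op in body)


def apply_extensions(body, extensible):
    """Replace each call or nested fragment whose name is in `extensible`
    with a single extended instruction; leave other operations unchanged."""
    return [Op("ext", op.name) if op.name in extensible else op for op in body]


# A loop body that calls a simple function named "clip" (hypothetical).
body = [Op("alu", "add"), Op("call", "clip"), Op("alu", "store")]

before = is_cgra_candidate(body)
after = is_cgra_candidate(apply_extensions(body, {"clip"}))
```

In this sketch the loop is rejected while it still contains the call, and it becomes a candidate once the call is replaced by a single extended instruction. Which fragments are actually simple enough to convert is decided by the paper's method and is not modeled here.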