MediaBench: a tool for evaluating and synthesizing multimedia and communicatons systems
MICRO 30 Proceedings of the 30th annual ACM/IEEE international symposium on Microarchitecture
Cut ranking and pruning: enabling a general and efficient FPGA mapping solution
FPGA '99 Proceedings of the 1999 ACM/SIGDA seventh international symposium on Field programmable gate arrays
Application-specific instruction generation for configurable processor architectures
FPGA '04 Proceedings of the 2004 ACM/SIGDA 12th international symposium on Field programmable gate arrays
Instruction set extension with shadow registers for configurable processors
Proceedings of the 2005 ACM/SIGDA 13th international symposium on Field-programmable gate arrays
Automated Custom Instruction Generation for Domain-Specific Processor Acceleration
IEEE Transactions on Computers
Exploiting pipelining to relax register-file port constraints of instruction-set extensions
Proceedings of the 2005 international conference on Compilers, architectures and synthesis for embedded systems
DAOmap: a depth-optimal area optimization mapping algorithm for FPGA designs
Proceedings of the 2004 IEEE/ACM International conference on Computer-aided design
Scalable subgraph mapping for acyclic computation accelerators
CASES '06 Proceedings of the 2006 international conference on Compilers, architecture and synthesis for embedded systems
Custom-instruction synthesis for extensible-processor platforms
IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems
Exact and approximate algorithms for the extension of embedded processor instruction sets
IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems
Interconnect customization for a hardware fabric
ACM Transactions on Design Automation of Electronic Systems (TODAES)
Vector Processing as a Soft Processor Accelerator
ACM Transactions on Reconfigurable Technology and Systems (TRETS)
Modern development methods and tools for embedded reconfigurable systems: A survey
Integration, the VLSI Journal
Design-space exploration of resource-sharing solutions for custom instruction set extensions
IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems
Power, performance and security optimized hardware design for H.264
Proceedings of the Sixth Annual Workshop on Cyber Security and Information Intelligence Research
ACM Transactions on Reconfigurable Technology and Systems (TRETS)
Architecture support for custom instructions with memory operations
Proceedings of the ACM/SIGDA international symposium on Field programmable gate arrays
Resource-constrained high-level datapath optimization in ASIP design
Proceedings of the Conference on Design, Automation and Test in Europe
Hi-index | 0.00 |
Application-Specific Instruction-set Processors (ASIP) can improve execution speed by using custom instructions. Several ASIP design automation flows have been proposed recently. In this paper, we investigate two techniques to improve these flows, so that ASIP can be efficiently applied to simple computer architectures in embedded applications. Firstly, we efficiently generate custom instructions with multi-cycle IO (which allows multi-outputs), thus removing the constraint imposed by the ports of the register file. Secondly, we allow identical portions of different custom instructions to be shared, thus allowing more custom instructions under the same area constraint. To handle the greatly increased exploration space, we propose several heuristics to keep the problem tractable. Experimental results show that we can achieve 3x speedup in some cases