MediaBench: a tool for evaluating and synthesizing multimedia and communicatons systems
MICRO 30 Proceedings of the 30th annual ACM/IEEE international symposium on Microarchitecture
A DAG-based design approach for reconfigurable VLIW processors
DATE '99 Proceedings of the conference on Design, automation and test in Europe
Instruction generation for hybrid reconfigurable systems
ACM Transactions on Design Automation of Electronic Systems (TODAES)
Synthesis of custom processors based on extensible platforms
Proceedings of the 2002 IEEE/ACM international conference on Computer-aided design
Proceedings of the tenth international symposium on Hardware/software codesign
Automatic application-specific instruction-set extensions under microarchitectural constraints
Proceedings of the 40th annual Design Automation Conference
Processor Acceleration Through Automated Instruction Set Customization
Proceedings of the 36th annual IEEE/ACM International Symposium on Microarchitecture
Application-specific instruction generation for configurable processor architectures
FPGA '04 Proceedings of the 2004 ACM/SIGDA 12th international symposium on Field programmable gate arrays
Scalable custom instructions identification for instruction-set extensible processors
Proceedings of the 2004 international conference on Compilers, architecture, and synthesis for embedded systems
The MOLEN Polymorphic Processor
IEEE Transactions on Computers
ISEGEN: Generation of High-Quality Instruction Set Extensions by Iterative Improvement
Proceedings of the conference on Design, Automation and Test in Europe - Volume 2
An integer linear programming approach for identifying instruction-set extensions
CODES+ISSS '05 Proceedings of the 3rd IEEE/ACM/IFIP international conference on Hardware/software codesign and system synthesis
Exploiting pipelining to relax register-file port constraints of instruction-set extensions
Proceedings of the 2005 international conference on Compilers, architectures and synthesis for embedded systems
Proceedings of the conference on Design, automation and test in Europe: Proceedings
Switch Box Architectures for Three-Dimensional FPGAs
FCCM '06 Proceedings of the 14th Annual IEEE Symposium on Field-Programmable Custom Computing Machines
Automatic selection of application-specific instruction-set extensions
CODES+ISSS '06 Proceedings of the 4th international conference on Hardware/software codesign and system synthesis
Area and delay trade-offs in the circuit and architecture design of FPGAs
Proceedings of the 16th international ACM/SIGDA symposium on Field programmable gate arrays
Automatic selection of application-specific reconfigurable processor extensions
Proceedings of the conference on Design, automation and test in Europe
Automatic Instruction-Set Extensions with the Linear Complexity Spiral Search
RECONFIG '08 Proceedings of the 2008 International Conference on Reconfigurable Computing and FPGAs
ARC'07 Proceedings of the 3rd international conference on Reconfigurable computing: architectures, tools and applications
SAMOS'07 Proceedings of the 7th international conference on Embedded computer systems: architectures, modeling, and simulation
Hi-index | 0.00 |
In this paper, two general algorithms for the automatic generation of instruction-set extensions are presented. The basic instruction set of a reconfigurable architecture is specialized with new application-specific instructions. The paper proposes two methods for the generation of convex multiple input multiple output instructions, under hardware resource constraints, based on a two-step clustering process. Initially, the application is partitioned in single-output instructions of variable size and then, selected clusters are combined in convex multiple output clusters following different policies. Our results on well-known kernels show that the extended Instructions-Set allows to execute applications more efficiently and needing fewer cycles. Our results show that a significant overall application speed-up is achieved even for large kernels (for ADPCM decoder the speed-up is up to x2.2 and for TWOFISH encoder the speedup is up to x5.5).