Greed is good: approximating independent sets in sparse and bounded-degree graphs
STOC '94 Proceedings of the twenty-sixth annual ACM symposium on Theory of computing
MediaBench: a tool for evaluating and synthesizing multimedia and communicatons systems
MICRO 30 Proceedings of the 30th annual ACM/IEEE international symposium on Microarchitecture
Instruction generation for hybrid reconfigurable systems
ACM Transactions on Design Automation of Electronic Systems (TODAES)
Reconfigurable Instruction Set Processors from a Hardware/Software Perspective
IEEE Transactions on Software Engineering
Efficient instruction encoding for automatic instruction set design of configurable ASIPs
Proceedings of the 2002 IEEE/ACM international conference on Computer-aided design
FPGA resource and timing estimation from Matlab execution traces
Proceedings of the tenth international symposium on Hardware/software codesign
A graph covering algorithm for a coarse grain reconfigurable system
Proceedings of the 2003 ACM SIGPLAN conference on Language, compiler, and tool for embedded systems
Accurate Area and Delay Estimators for FPGAs
Proceedings of the conference on Design, automation and test in Europe
Processor Acceleration Through Automated Instruction Set Customization
Proceedings of the 36th annual IEEE/ACM International Symposium on Microarchitecture
Application-specific instruction generation for configurable processor architectures
FPGA '04 Proceedings of the 2004 ACM/SIGDA 12th international symposium on Field programmable gate arrays
An area estimation methodology for FPGA based designs at systemc-level
Proceedings of the 41st annual Design Automation Conference
Characterizing embedded applications for instruction-set extensible processors
Proceedings of the 41st annual Design Automation Conference
Automated Custom Instruction Generation for Domain-Specific Processor Acceleration
IEEE Transactions on Computers
Modeling Arbitrator Delay-Area Dependencies in Customizable Instruction Set Processors
DELTA '06 Proceedings of the Third IEEE International Workshop on Electronic Design, Test and Applications
DAOmap: a depth-optimal area optimization mapping algorithm for FPGA designs
Proceedings of the 2004 IEEE/ACM International conference on Computer-aided design
Compile-time area estimation for LUT-based FPGAs
ACM Transactions on Design Automation of Electronic Systems (TODAES)
MiBench: A free, commercially representative embedded benchmark suite
WWC '01 Proceedings of the Workload Characterization, 2001. WWC-4. 2001 IEEE International Workshop
Optimal simultaneous mapping and clustering for FPGA delay optimization
Proceedings of the 43rd annual Design Automation Conference
Automatic selection of application-specific instruction-set extensions
CODES+ISSS '06 Proceedings of the 4th international conference on Hardware/software codesign and system synthesis
Architecture and compiler optimizations for data bandwidth improvement in configurable processors
IEEE Transactions on Very Large Scale Integration (VLSI) Systems
Rapid design of area-efficient custom instructions for reconfigurable embedded processing
Journal of Systems Architecture: the EUROMICRO Journal
Recurrence-aware instruction set selection for extensible embedded processors
IEEE Transactions on Very Large Scale Integration (VLSI) Systems
Customizing the datapath and ISA of soft VLIW processors
HiPEAC'07 Proceedings of the 2nd international conference on High performance embedded architectures and compilers
Selecting profitable custom instructions for reconfigurable processors
Journal of Systems Architecture: the EUROMICRO Journal
Architecture-Aware Technique for Mapping Area-Time Efficient Custom Instructions onto FPGAs
IEEE Transactions on Computers
Bitwidth cognizant architecture synthesis of custom hardware accelerators
IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems
Custom-instruction synthesis for extensible-processor platforms
IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems
Exact and approximate algorithms for the extension of embedded processor instruction sets
IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems
Design Space Pruning Through Early Estimations of Area/Delay Tradeoffs for FPGA Implementations
IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems
IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems
CHIPS: Custom Hardware Instruction Processor Synthesis
IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems
Hi-index | 0.00 |
The main aim of this article is to demonstrate that a fast and accurate FPGA estimation engine is indispensable in design flows for custom instruction (template) selection. The need for a FPGA estimation engine stems from the difficulty in predicting the FPGA performance measures of selected custom instructions. We will present a FPGA estimation technique that partitions the high-level representation of custom instructions into clusters based on the structural organization of the target FPGA, while taking into account general logic synthesis principles adopted by FPGA tools. In this work, we have evaluated a widely used graph covering algorithm with various heuristics for custom instruction selection. In addition, we present an algorithm called Refined Largest Fit First (RLFF) that relies on a graph covering heuristic to select non-overlapping superset templates, which typically incorporate frequently used basic templates. The initial solution is further refined by considering overlapping templates that were ignored previously to see if their introduction could lead to higher performance. While RLFF provides the most efficient cover compared to the ILP method and other graph covering heuristics, FPGA estimation results reveals that RLFF leads to the worst performance in certain applications. It is therefore a worthy proposition to equip design flows with accurate FPGA estimation in order to rapidly determine the most profitable custom instruction approach for a given application.