Automatic application-specific instruction-set extensions under microarchitectural constraints
Proceedings of the 40th annual Design Automation Conference
Energy-efficient instruction set synthesis for application-specific processors
Proceedings of the 2003 international symposium on Low power electronics and design
Processor Acceleration Through Automated Instruction Set Customization
Proceedings of the 36th annual IEEE/ACM International Symposium on Microarchitecture
Area-efficient instruction set synthesis for reconfigurable system-on-chip designs
Proceedings of the 41st annual Design Automation Conference
Compiler based exploration of DSP energy savings by SIMD operations
Proceedings of the 2004 Asia and South Pacific Design Automation Conference
Scalable custom instructions identification for instruction-set extensible processors
Proceedings of the 2004 international conference on Compilers, architecture, and synthesis for embedded systems
An Energy Efficient Instruction Set Synthesis Framework for Low Power Embedded System Designs
IEEE Transactions on Computers
Exploiting pipelining to relax register-file port constraints of instruction-set extensions
Proceedings of the 2005 international conference on Compilers, architectures and synthesis for embedded systems
Exploring the design space of LUT-based transparent accelerators
Proceedings of the 2005 international conference on Compilers, architectures and synthesis for embedded systems
A unified theory of timing budget management
Proceedings of the 2004 IEEE/ACM International conference on Computer-aided design
Dynamic Template Generation for Resource Sharing in Control and Data Flow Graphs
VLSID '06 Proceedings of the 19th International Conference on VLSI Design held jointly with 5th International Conference on Embedded Systems Design
Rethinking custom ISE identification: a new processor-agnostic method
CASES '07 Proceedings of the 2007 international conference on Compilers, architecture, and synthesis for embedded systems
Graph Matching Constraints for Synthesis with Complex Components
DSD '07 Proceedings of the 10th Euromicro Conference on Digital System Design Architectures, Methods and Tools
Increasing data-bandwidth to instruction-set extensions through register clustering
Proceedings of the 2007 IEEE/ACM international conference on Computer-aided design
Efficient ASIP design for configurable processors with fine-grained resource sharing
Proceedings of the 16th international ACM/SIGDA symposium on Field programmable gate arrays
Resource Sharing in Custom Instruction Set Extensions
SASP '08 Proceedings of the 2008 Symposium on Application Specific Processors
Graph clustering using the weighted minimum common supergraph
GbRPR'03 Proceedings of the 4th IAPR international conference on Graph based representations in pattern recognition
ISEGEN: an iterative improvement-based ISE generation technique for fast customization of processors
IEEE Transactions on Very Large Scale Integration (VLSI) Systems
Efficient datapath merging for partially reconfigurable architectures
IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems
Exact and approximate algorithms for the extension of embedded processor instruction sets
IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems
Measuring the Gap Between FPGAs and ASICs
IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems
Introduction of Architecturally Visible Storage in Instruction Set Extensions
IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems
Resource-constrained high-level datapath optimization in ASIP design
Proceedings of the Conference on Design, Automation and Test in Europe
Predicting best design trade-offs: a case study in processor customization
DATE '12 Proceedings of the Conference on Design, Automation and Test in Europe
Selective flexibility: breaking the rigidity of datapath merging
DATE '12 Proceedings of the Conference on Design, Automation and Test in Europe
Hardware reuse in modern application-specific processors and accelerators
Microprocessors & Microsystems
Hi-index | 0.03 |
Customized processor performance generally increases as additional custom instructions are added. However, performance is not the only metric that modern systems must take into account; die area and energy efficiency are equally important. Resource sharing during synthesis of instruction set extensions (ISEs) can significantly reduce the die area and energy consumption of a customized processor. This may increase the number of custom instructions that can be synthesized with a given area budget. Resource sharing involves combining the graph representations of two or more ISEs which contain a similar subgraph. This coupling of multiple subgraphs, if performed naively, can increase the latency of the extension instructions considerably, and yet, as we show in this paper, an appropriate level of resource sharing provides a significantly simpler design with modest increases in average latency for ISEs. Our main contributions are the introduction of a parametric method for exploring the tradeoffs that can be achieved between instruction latency and implementation complexity, and the coupling of design-space exploration with fast area-delay models for the operators comprising each ISE. We present experimental evidence that our heuristic exposes a broad range of design points, allowing advantageous tradeoffs between die area and latency to be found and exploited.