Computer architecture: a quantitative approach
Computer architecture: a quantitative approach
Synthesis of pipelined instruction set processors
DAC '93 Proceedings of the 30th international Design Automation Conference
Generating instruction sets and microarchitectures from applications
ICCAD '94 Proceedings of the 1994 IEEE/ACM international conference on Computer-aided design
Synthesis of instruction sets for pipelined microprocessors
DAC '94 Proceedings of the 31st annual Design Automation Conference
Instruction set definition and instruction selection for ASIPs
ISSS '94 Proceedings of the 7th international symposium on High-level synthesis
An ASIP instruction set optimization algorithm with functional module sharing constraint
ICCAD '93 Proceedings of the 1993 IEEE/ACM international conference on Computer-aided design
Synthesis of application specific programmable processors
DAC '97 Proceedings of the 34th annual Design Automation Conference
Advanced compiler design and implementation
Advanced compiler design and implementation
The practice of programming
Instruction set selection for ASIP design
CODES '99 Proceedings of the seventh international workshop on Hardware/software codesign
An ASIP design methodology for embedded systems
CODES '99 Proceedings of the seventh international workshop on Hardware/software codesign
Automatic detection of recurring operation patterns
CODES '99 Proceedings of the seventh international workshop on Hardware/software codesign
Synthesis of Application Specific Instructions for Embedded DSP Software
IEEE Transactions on Computers
Customized instruction-sets for embedded processors
Proceedings of the 36th annual ACM/IEEE Design Automation Conference
Exploiting intellectual properties in ASIP designs for embedded DSP software
Proceedings of the 36th annual ACM/IEEE Design Automation Conference
Designing power efficient hypermedia processors
ISLPED '99 Proceedings of the 1999 international symposium on Low power electronics and design
Subsetting Behavioral Intellectual Property for Low Power ASIP Design
Journal of VLSI Signal Processing Systems - Special issue on system level design
PSCP: a scalable parallel ASIP architecture for reactive systems
Proceedings of the conference on Design, automation and test in Europe
Effectiveness of the ASIP design system PEAS-III in design of pipelined processors
Proceedings of the 2001 Asia and South Pacific Design Automation Conference
Hardware/software instruction set configurability for system-on-chip processors
Proceedings of the 38th annual Design Automation Conference
Synthesis and Optimization of Digital Circuits
Synthesis and Optimization of Digital Circuits
Readings in Hardware/Software Co-Design
Readings in Hardware/Software Co-Design
EDTC '96 Proceedings of the 1996 European conference on Design and Test
Automatic Architectural Synthesis of VLIW and EPIC Processors
Proceedings of the 12th international symposium on System synthesis
Algorithms for compiler-assisted design space exploration of clustered vliw asip datapaths
Algorithms for compiler-assisted design space exploration of clustered vliw asip datapaths
Proceedings of the 2003 international conference on Compilers, architecture and synthesis for embedded systems
Processor Acceleration Through Automated Instruction Set Customization
Proceedings of the 36th annual IEEE/ACM International Symposium on Microarchitecture
MINCE: Matching INstructions Using Combinational Equivalence for Extensible Processor
Proceedings of the conference on Design, automation and test in Europe - Volume 2
Area-efficient instruction set synthesis for reconfigurable system-on-chip designs
Proceedings of the 41st annual Design Automation Conference
Introduction of local memory elements in instruction set extensions
Proceedings of the 41st annual Design Automation Conference
International Journal of Parallel Programming - Special issue: Workshop on application specific processors (WASP)
A Scalable Application-Specific Processor Synthesis Methodology
Proceedings of the 2003 IEEE/ACM international conference on Computer-aided design
INSIDE: INstruction Selection/Identification & Design Exploration for Extensible Processors
Proceedings of the 2003 IEEE/ACM international conference on Computer-aided design
Dual-pipeline heterogeneous ASIP design
Proceedings of the 2nd IEEE/ACM/IFIP international conference on Hardware/software codesign and system synthesis
Proceedings of the 37th annual IEEE/ACM International Symposium on Microarchitecture
Instruction set extension with shadow registers for configurable processors
Proceedings of the 2005 ACM/SIGDA 13th international symposium on Field-programmable gate arrays
ISEGEN: Generation of High-Quality Instruction Set Extensions by Iterative Improvement
Proceedings of the conference on Design, Automation and Test in Europe - Volume 2
Synthesis of application-specific highly efficient multi-mode cores for embedded systems
ACM Transactions on Embedded Computing Systems (TECS)
Fine-grained application source code profiling for ASIP design
Proceedings of the 42nd annual Design Automation Conference
An Architecture Framework for Transparent Instruction Set Customization in Embedded Processors
Proceedings of the 32nd annual international symposium on Computer Architecture
Code Size Reduction in Heterogeneous-Connectivity-Based DSPs Using Instruction Set Extensions
IEEE Transactions on Computers
Automated Custom Instruction Generation for Domain-Specific Processor Acceleration
IEEE Transactions on Computers
Exploring the design space of LUT-based transparent accelerators
Proceedings of the 2005 international conference on Compilers, architectures and synthesis for embedded systems
Architecture and compilation for data bandwidth improvement in configurable embedded processors
ICCAD '05 Proceedings of the 2005 IEEE/ACM International conference on Computer-aided design
Proceedings of the conference on Design, automation and test in Europe: Proceedings
Proceedings of the conference on Design, automation and test in Europe: Proceedings
Customization of application specific heterogeneous multi-pipeline processors
Proceedings of the conference on Design, automation and test in Europe: Proceedings
Automatic selection of application-specific instruction-set extensions
CODES+ISSS '06 Proceedings of the 4th international conference on Hardware/software codesign and system synthesis
Application specific forwarding network and instruction encoding for multi-pipe ASIPs
CODES+ISSS '06 Proceedings of the 4th international conference on Hardware/software codesign and system synthesis
Instruction set synthesis with efficient instruction encoding for configurable processors
ACM Transactions on Design Automation of Electronic Systems (TODAES)
Architecture and compiler optimizations for data bandwidth improvement in configurable processors
IEEE Transactions on Very Large Scale Integration (VLSI) Systems
Proceedings of the 17th ACM Great Lakes symposium on VLSI
The Journal of Supercomputing
Speedups in embedded systems with a high-performance coprocessor datapath
ACM Transactions on Design Automation of Electronic Systems (TODAES)
Proceedings of the conference on Design, automation and test in Europe
Journal of Systems Architecture: the EUROMICRO Journal
Journal of Signal Processing Systems - Special Issue: Embedded computing systems for DSP
An architecture framework for an adaptive extensible processor
The Journal of Supercomputing
The Instruction-Set Extension Problem: A Survey
ARC '08 Proceedings of the 4th international workshop on Reconfigurable Computing: Architectures, Tools and Applications
IEICE Transactions on Fundamentals of Electronics, Communications and Computer Sciences
Proceedings of the 6th ACM conference on Computing frontiers
An Application Development Framework for ARISE Reconfigurable Processors
ACM Transactions on Reconfigurable Technology and Systems (TRETS)
ACM Transactions on Design Automation of Electronic Systems (TODAES)
Modern development methods and tools for embedded reconfigurable systems: A survey
Integration, the VLSI Journal
ARC'07 Proceedings of the 3rd international conference on Reconfigurable computing: architectures, tools and applications
SAMOS'07 Proceedings of the 7th international conference on Embedded computer systems: architectures, modeling, and simulation
Algorithms for the automatic extension of an instruction-set
Proceedings of the Conference on Design, Automation and Test in Europe
CASES '10 Proceedings of the 2010 international conference on Compilers, architectures and synthesis for embedded systems
ISEGEN: an iterative improvement-based ISE generation technique for fast customization of processors
IEEE Transactions on Very Large Scale Integration (VLSI) Systems
The Instruction-Set Extension Problem: A Survey
ACM Transactions on Reconfigurable Technology and Systems (TRETS)
High performance and area efficient flexible DSP datapath synthesis
IEEE Transactions on Very Large Scale Integration (VLSI) Systems
The ARISE approach for extending embedded processors with arbitrary hardware accelerators
IEEE Transactions on Very Large Scale Integration (VLSI) Systems
The Journal of Supercomputing
Extended Instruction Exploration for Multiple-Issue Architectures
ACM Transactions on Embedded Computing Systems (TECS)
Hi-index | 0.01 |
Efficiency and flexibility are critical, but often conflicting, design goals in embedded system design. The recent emergence of extensible processors promises a favorable tradeoff between efficiency and flexibility, while keeping design turnaround times short. Current extensible processor design flows automate several tedious tasks, but typically require designers to manually select the parts of the program that are to be implemented as custom instructions.In this work, we describe an automatic methodology to select custom instructions to augment an extensible processor, in order to maximize its efficiency for a given application program. We demonstrate that the number of custom instruction candidates grows rapidly with program size, leading to a large design space, and that the quality (speedup) of custom instructions varies significantly across this space, motivating the need for the proposed flow. Our methodology features cost functions to guide the custom instruction selection process, as well as static and dynamic pruning techniques to eliminate inferior parts of the design space from consideration. Further, we employ a two-stage process, wherein a limited number of promising instruction candidates are first selected, and then evaluated in more detail through cycle-accurate instruction set simulation and synthesis of the corresponding hardware, to identify the custom instruction combinations that result in the highest program speedup or maximize speedup under a given area constraint.We have evaluated the proposed techniques using a state-of-the-art extensible processor platform, in the context of a commercial design flow. Experiments with several benchmark programs indicate that custom processors synthesized using automatic custom instruction selection can result in large improvements in performance (upto 5.4X, average of 3.4X), energy (upto 4.5X, average of 3.2X), and energy-delay product (upto 24.2X, average of 12.6X), while speeding up the design process significantly.