Application-specific extensions to the computational capabilities of a processor provide an efficient mechanism to meet the growing performance and power demands of embedded applications. Hardware, in the form of new function units (or coprocessors), and the corresponding instructions are added to a baseline processor to meet the critical computational demands of a target application. In this paper, the design of a system to automate the instruction set customization process is presented. A dataflow graph design-space exploration engine efficiently identifies computation subgraphs to create custom hardware, and a compiler subgraph-matching framework seamlessly exploits this hardware. We demonstrate the effectiveness of this system across a range of application domains and study the applicability of the custom hardware across an entire application domain. Generalization techniques are presented that enable the application-specific hardware to be used more effectively across a domain.
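To make the idea of identifying computation subgraphs concrete, the following is a minimal sketch, not the system described in the paper. It enumerates connected, convex subgraphs of a toy dataflow graph by brute force and keeps those that fit a register-port budget. The graph, the opcode names, the helper names (`consumers`, `ports`, `is_convex`, `candidate_subgraphs`), and the limits `max_nodes`, `max_in`, and `max_out` are all assumptions made for illustration; a practical exploration engine would use guided search rather than exhaustive growth.

```python
# Hypothetical toy dataflow graph: node -> (opcode, operand names).
# Operands that are not keys of DFG are treated as live-in values.
DFG = {
    "t1": ("shl", ["in_a", "in_b"]),
    "t2": ("and", ["t1", "in_c"]),
    "t3": ("add", ["t2", "in_d"]),
    "t4": ("xor", ["t3", "t1"]),
}
LIVE_OUT = {"t4"}  # values needed after this block (assumed)


def consumers(dfg):
    """Map each node to the set of nodes that read its result."""
    users = {n: set() for n in dfg}
    for n, (_, operands) in dfg.items():
        for src in operands:
            if src in users:
                users[src].add(n)
    return users


def ports(dfg, users, live_out, nodes):
    """Count the distinct inputs and outputs a candidate subgraph would need."""
    inputs = {src for n in nodes for src in dfg[n][1] if src not in nodes}
    outputs = {n for n in nodes if n in live_out or users[n] - nodes}
    return len(inputs), len(outputs)


def is_convex(users, nodes):
    """True if no path leaves the subgraph and later re-enters it."""
    stack = [u for n in nodes for u in users[n] if u not in nodes]
    visited = set()
    while stack:
        n = stack.pop()
        if n in visited:
            continue
        visited.add(n)
        for u in users[n]:
            if u in nodes:
                return False
            stack.append(u)
    return True


def candidate_subgraphs(dfg, live_out, max_nodes=3, max_in=3, max_out=1):
    """Grow connected subgraphs node by node and keep the feasible ones."""
    users = consumers(dfg)

    def neighbors(n):
        return {s for s in dfg[n][1] if s in dfg} | users[n]

    seen, result = set(), []
    frontier = [frozenset([n]) for n in dfg]
    while frontier:
        sub = frontier.pop()
        if sub in seen:
            continue
        seen.add(sub)
        n_in, n_out = ports(dfg, users, live_out, sub)
        if (len(sub) > 1 and n_in <= max_in and n_out <= max_out
                and is_convex(users, sub)):
            result.append(sub)
        # No pruning on port counts: adding a node can reduce them.
        if len(sub) < max_nodes:
            for n in sub:
                for m in neighbors(n) - sub:
                    frontier.append(sub | {m})
    return sorted(result, key=lambda s: (len(s), sorted(s)))


if __name__ == "__main__":
    for sub in candidate_subgraphs(DFG, LIVE_OUT):
        print(sorted(sub), [DFG[n][0] for n in sorted(sub)])
```

Each surviving subgraph is a candidate for a single custom instruction. Ranking the candidates by estimated speedup and hardware cost, and matching them against other applications in the same domain, would be separate steps that this sketch does not attempt.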