Automatic generation of custom instruction processors from high-level application descriptions enables fast design space exploration while offering favorable trade-offs between performance and silicon area. This work introduces a novel method for adapting the instruction set to an application captured in a high-level language. A simplified model is used to find the optimal instructions by enumerating the maximal convex subgraphs of the application's data flow graphs (DFGs). Our experiments on a set of multimedia and cryptography benchmarks show that an order-of-magnitude performance improvement can be achieved using only a limited amount of hardware resources. In most cases, our algorithm takes less than a second to execute.
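To make the notion of a maximal convex subgraph concrete, the following is a minimal brute-force sketch, not the algorithm of this work. It assumes a DFG stored as a dictionary mapping every node to its successors, and it assumes a set of "forbidden" nodes (for example memory loads) that may not be part of a custom instruction; that constraint is not stated in the abstract and is introduced here only because, without it, the whole DFG is trivially the single maximal convex subgraph. A node set is convex when no path between two of its nodes passes through a node outside the set. The function names and the example graph are hypothetical.

```python
from itertools import combinations


def reachable(dfg, start):
    """Return every node reachable from `start` via one or more edges."""
    seen, stack = set(), list(dfg[start])
    while stack:
        node = stack.pop()
        if node not in seen:
            seen.add(node)
            stack.extend(dfg[node])
    return seen


def is_convex(dfg, subset):
    """True if no path between two nodes of `subset` leaves `subset`."""
    subset = set(subset)
    desc = {n: reachable(dfg, n) for n in dfg}
    # Outside nodes that are fed (directly or indirectly) by the subset.
    below = set().union(*(desc[n] for n in subset)) - subset
    # Convexity fails if any such outside node feeds back into the subset.
    return not any(desc[v] & subset for v in below)


def maximal_convex_subgraphs(dfg, forbidden):
    """Exhaustively list convex node sets that avoid `forbidden` nodes
    and are not strictly contained in another such set.
    Exponential in the number of nodes: for illustration only."""
    candidates = [n for n in dfg if n not in forbidden]
    convex = [
        frozenset(c)
        for r in range(1, len(candidates) + 1)
        for c in combinations(candidates, r)
        if is_convex(dfg, c)
    ]
    return [s for s in convex if not any(s < t for t in convex)]


# Hypothetical DFG: add -> shl -> load -> xor -> mul, plus add -> xor.
dfg = {
    "add": ["shl", "xor"],
    "shl": ["load"],
    "load": ["xor"],
    "xor": ["mul"],
    "mul": [],
}

for nodes in maximal_convex_subgraphs(dfg, forbidden={"load"}):
    print(sorted(nodes))
```

In this example the forbidden `load` node splits the graph: `{add, shl}` and `{xor, mul}` are each convex, but any set that combines nodes from both sides has a path running through `load` (or through an excluded `shl`) and is therefore rejected. The exhaustive search is only practical for very small graphs.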