Generating instruction sets and microarchitectures from applications
ICCAD '94 Proceedings of the 1994 IEEE/ACM international conference on Computer-aided design
A tool for processor instruction set design
EURO-DAC '94 Proceedings of the conference on European design automation
Instruction fetch mechanisms for VLIW architectures with compressed encodings
Proceedings of the 29th annual ACM/IEEE international symposium on Microarchitecture
Instruction selection for embedded DSPs with complex instructions
EURO-DAC '96/EURO-VHDL '96 Proceedings of the conference on European design automation
Instruction selection, resource allocation, and scheduling in the AVIV retargetable code generator
DAC '98 Proceedings of the 35th annual Design Automation Conference
Advanced compiler design and implementation
Advanced compiler design and implementation
Synthesis of Application Specific Instructions for Embedded DSP Software
IEEE Transactions on Computers
EXPRESSION: a language for architecture exploration through compiler/simulator retargetability
DATE '99 Proceedings of the conference on Design, automation and test in Europe
Compression of Embedded System Programs
ICCS '94 Proceedings of the1994 IEEE International Conference on Computer Design: VLSI in Computer & Processors
Synthesis of custom processors based on extensible platforms
Proceedings of the 2002 IEEE/ACM international conference on Computer-aided design
Efficient instruction encoding for automatic instruction set design of configurable ASIPs
Proceedings of the 2002 IEEE/ACM international conference on Computer-aided design
Automatic application-specific instruction-set extensions under microarchitectural constraints
Proceedings of the 40th annual Design Automation Conference
Code Size Efficiency in Global Scheduling for ILP Processors
INTERACT '02 Proceedings of the Sixth Annual Workshop on Interaction between Compilers and Computer Architectures
Automatic generation of application specific processors
Proceedings of the 2003 international conference on Compilers, architecture and synthesis for embedded systems
ISEGEN: Generation of High-Quality Instruction Set Extensions by Iterative Improvement
Proceedings of the conference on Design, Automation and Test in Europe - Volume 2
MiBench: A free, commercially representative embedded benchmark suite
WWC '01 Proceedings of the Workload Characterization, 2001. WWC-4. 2001 IEEE International Workshop
Code density optimization for embedded DSP processors using data compression techniques
IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems
The Instruction-Set Extension Problem: A Survey
ACM Transactions on Reconfigurable Technology and Systems (TRETS)
Hi-index | 14.98 |
Existing trend of processors shows a progress toward customizable and reconfigurable architectures. In this paper, we study the benefit of combining the architectural design of a VLIW DSP and the concepts of modern customizable processors like ASIPs (Application Specific Instruction Set Processors) for code size reduction. VLIW DSP architectures exhibit heterogeneous connections between functional units and register files for speeding up special tasks. Such architectural characteristics can be effectively exploited through the use of complex instruction set extensions (ISEs). Although VLIWs are increasingly being used for DSP applications to achieve very high performance, such architectures are known to suffer from increased code size. This paper also addresses how to generate and use ISEs that can result in significant code size reduction in VLIW DSPs without degrading performance. Unfortunately, contemporary techniques for generation of ISEs when applied before resource-binding fail to generate legal ISEs for VLIW architectures with heterogeneous connectivity between the functional units and register files. We propose a Heuristic-based approach to generate ISEs for a generalized heterogeneous-connecitivity-based VLIW DSP architecture. We achieve an average code size reduction of 25 percent on the MiBench suite with no penalty in performance by applying our ISE generation algorithms on the TI TMS320C6xx, a representative VLIW DSP. We also show that the overhead of the required architectural assists for our approach is minimal: The TMS320C6xx pipeline meets the required timing with only a limited overhead in area.