Simultaneous multithreading: maximizing on-chip parallelism
ISCA '95 Proceedings of the 22nd annual international symposium on Computer architecture
Instruction fetch mechanisms for VLIW architectures with compressed encodings
Proceedings of the 29th annual ACM/IEEE international symposium on Microarchitecture
Improving code density using compression techniques
MICRO 30 Proceedings of the 30th annual ACM/IEEE international symposium on Microarchitecture
Code generation algorithms for digital signal processors
Code generation algorithms for digital signal processors
Performance evaluation for a compressed-VLIW processor
Proceedings of the 2002 ACM symposium on Applied computing
Profile guided selection of ARM and thumb instructions
Proceedings of the joint conference on Languages, compilers and tools for embedded systems: software and compilers for embedded systems
Using Complete Machine Simulation for Software Power Estimation: The SoftWatt Approach
HPCA '02 Proceedings of the 8th International Symposium on High-Performance Computer Architecture
Energy efficient code generation exploiting reduced bit-width instruction set architectures (rISA)
Proceedings of the 2004 Asia and South Pacific Design Automation Conference
Improving Program Efficiency by Packing Instructions into Registers
Proceedings of the 32nd annual international symposium on Computer Architecture
Reducing code size in VLIW instruction scheduling
Journal of Embedded Computing - Low-power Embedded Systems
Proceedings of the conference on Design, automation and test in Europe
Computer Organization and Design: The Hardware/Software Interface
Computer Organization and Design: The Hardware/Software Interface
Fast Code Generation for Embedded Processors with Aliased Heterogeneous Registers
Transactions on High-Performance Embedded Architectures and Compilers II
Proceedings of the 7th annual IEEE/ACM International Symposium on Code Generation and Optimization
Code compression for embedded VLIW processors using variable-to-fixed coding
IEEE Transactions on Very Large Scale Integration (VLSI) Systems
Hi-index | 0.00 |
VLIW (very long instruction word) architectures have proven to be useful for embedded applications with abundant instruction level parallelism. But due to the long instruction bus width it often consumes more power and memory space than necessary. One way to lessen this problem is to adopt a reduced bit-width instruction set architecture (ISA) that has a narrower instruction word length. This facilitates a more efficient hardware implementation in terms of area and power by decreasing bus-bandwidth requirements and the power dissipation associated with instruction fetches. In practice, however, it is impossible to convert a given ISA fully into an equivalent reduced bit-width one because the narrow instruction word, due to bit-width restrictions, can encode only a small subset of normal instructions in the original ISA. Consequently, existing processors provide narrow instructions in very limited cases along with severe restrictions on register accessibility. The objective of this work is to explore the possibility of complete conversion, as a case study, of an existing 32-bit VLIW ISA into a 16-bit one that supports effectively all 32-bit instructions. To this objective, we attempt to circumvent the bit-width restrictions by dynamically extending the effective instruction word length of the converted 16-bit operations. Further, we will show that our proposed ISA conversion can create a synergy effect with a VLES (variable length execution set) architecture that is adopted in most recent VLIW processors. According to our experiment, the code size becomes significantly smaller after the conversion to 16-bit VLIW code. Also at a slight run time cost, the machine with the 16-bit ISA consumes much less energy than the original machine.