Reducing instruction bit-width for low-power VLIW architectures

Authors:
Jongwon Lee;Jonghee M. Youn;Doosan Cho;Yunheung Paek
Affiliations:
Seoul National University, Seoul, Korea;Seoul National University, Seoul, Korea;Sunchon National University, Korea;Seoul National University, Seoul, Korea
Venue:
ACM Transactions on Design Automation of Electronic Systems (TODAES)
Year:
2013

Abstract

VLIW (very long instruction word) architectures have proven useful for embedded applications with abundant instruction-level parallelism. However, because of their wide instruction bus, they often consume more power and memory space than necessary. One way to mitigate this problem is to adopt a reduced bit-width instruction set architecture (ISA) with a narrower instruction word. A narrower word enables a more efficient hardware implementation in terms of area and power by decreasing bus-bandwidth requirements and the power dissipated by instruction fetches. In practice, however, it is impossible to convert a given ISA fully into an equivalent reduced bit-width one, because the narrow instruction word can encode only a small subset of the instructions in the original ISA. Consequently, existing processors provide narrow instructions only in very limited cases, and with severe restrictions on register accessibility. The objective of this work is to explore, as a case study, the possibility of completely converting an existing 32-bit VLIW ISA into a 16-bit one that supports effectively all of the 32-bit instructions. To this end, we attempt to circumvent the bit-width restrictions by dynamically extending the effective instruction word length of the converted 16-bit operations. Further, we show that the proposed ISA conversion works synergistically with the VLES (variable length execution set) architecture adopted in most recent VLIW processors. In our experiments, the code size becomes significantly smaller after conversion to 16-bit VLIW code. In addition, at a slight cost in run time, the machine with the 16-bit ISA consumes much less energy than the original machine.
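To make the code-size trade-off concrete, the following is a minimal back-of-the-envelope sketch, not the encoding used in the paper. It assumes a hypothetical scheme in which every 32-bit operation becomes one 16-bit word, and some fraction of operations needs exactly one additional 16-bit extension word to carry the bits that do not fit. The function names and the `ext_fraction` parameter are illustrative assumptions only.

```python
def converted_size_bits(num_ops: int, ext_fraction: float,
                        narrow_bits: int = 16) -> float:
    """Expected code size after conversion, under the toy model.

    ASSUMPTION (not from the paper): each operation takes one narrow
    word, and a fraction `ext_fraction` of operations takes one extra
    narrow word as an extension.
    """
    if not 0.0 <= ext_fraction <= 1.0:
        raise ValueError("ext_fraction must be between 0 and 1")
    return num_ops * narrow_bits * (1.0 + ext_fraction)


def size_ratio(ext_fraction: float, wide_bits: int = 32,
               narrow_bits: int = 16) -> float:
    """Converted code size divided by original code size."""
    original = wide_bits  # bits per operation in the original ISA
    converted = converted_size_bits(1, ext_fraction, narrow_bits)
    return converted / original


if __name__ == "__main__":
    # Hypothetical extension rates, chosen only for illustration.
    for f in (0.0, 0.25, 1.0):
        print(f"ext_fraction={f:.2f} -> size ratio {size_ratio(f):.3f}")
```

Under these assumptions the ratio is (1 + f) / 2 for an extension fraction f, so the converted code is smaller whenever fewer than all operations need an extension word. The model ignores fetch alignment, VLES packing, and run-time effects, all of which the actual design has to account for.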