A VLIW architecture for a trace Scheduling Compiler
IEEE Transactions on Computers - Special issue on architectural support for programming languages and operating systems
MIPS RISC architecture
Executing compressed programs on an embedded RISC architecture
MICRO 25 Proceedings of the 25th annual international symposium on Microarchitecture
Code optimization techniques for embedded DSP microprocessors
DAC '95 Proceedings of the 32nd annual ACM/IEEE Design Automation Conference
Improving code density using compression techniques
MICRO 30 Proceedings of the 30th annual ACM/IEEE international symposium on Microarchitecture
MediaBench: a tool for evaluating and synthesizing multimedia and communicatons systems
MICRO 30 Proceedings of the 30th annual ACM/IEEE international symposium on Microarchitecture
Code compression for embedded systems
DAC '98 Proceedings of the 35th annual Design Automation Conference
Code compression based on operand factorization
MICRO 31 Proceedings of the 31st annual ACM/IEEE international symposium on Microarchitecture
Selective instruction compression for memory energy reduction in embedded systems
ISLPED '99 Proceedings of the 1999 international symposium on Low power electronics and design
The Superscalar Architecture of the MC68060
IEEE Micro
High-Performance RISC Microprocessors
IEEE Micro
A decompression core for powerPC
IBM Journal of Research and Development
A unified processor architecture for RISC & VLIW DSP
GLSVLSI '05 Proceedings of the 15th ACM Great Lakes symposium on VLSI
Improving Program Efficiency by Packing Instructions into Registers
Proceedings of the 32nd annual international symposium on Computer Architecture
Performance evaluation of ring-structure register file in multimedia applications
ICME '03 Proceedings of the 2003 International Conference on Multimedia and Expo - Volume 2
Adapting compilation techniques to enhance the packing of instructions into registers
CASES '06 Proceedings of the 2006 international conference on Compilers, architecture and synthesis for embedded systems
Journal of Signal Processing Systems
Hi-index | 0.00 |
Existing variable-length instruction formats provide higher code densities than fixed-length formats, but are ill-suited to pipelined or parallel instruction fetch and decode. This paper presents a new variable-length instruction format that supports parallel fetch and decode of multiple instructions per cycle, allowing both high code density and rapid execution for high-performance embedded processors. In contrast to earlier schemes that store compressed variable-length instructions in main memory then expand them into fixed-length in-cache formats, the new format is suitable for direct execution from the instruction cache, thereby increasing effective cache capacity and reducing cache power. The new head-and-tails (HAT) format splits each instruction into a fixed-length head and a variable-length tail, and packs heads and tails in separate sections within a larger fixed-length instruction bundle. The heads can be easily fetched and decoded in parallel as they are a fixed distance apart in the instruction stream, while the variable-length tails provide improved code density. A conventional MIPS RISC instruction set is re-encoded in a variable-length HAT scheme, and achieves an average static code compression ratio of 75% and a dynamic fetch ratio (new-bits-fetched/old-bits-fetched) of 75%.