A VLIW architecture for a trace Scheduling Compiler
IEEE Transactions on Computers - Special issue on architectural support for programming languages and operating systems
Text compression
Executing compressed programs on an embedded RISC architecture
MICRO 25 Proceedings of the 25th annual international symposium on Microarchitecture
Improving code density using compression techniques
MICRO 30 Proceedings of the 30th annual ACM/IEEE international symposium on Microarchitecture
Enhanced code compression for embedded RISC processors
Proceedings of the ACM SIGPLAN 1999 conference on Programming language design and implementation
Selective instruction compression for memory energy reduction in embedded systems
ISLPED '99 Proceedings of the 1999 international symposium on Low power electronics and design
Compiler-driven cached code compression schemes for embedded ILP processors
Proceedings of the 32nd annual ACM/IEEE international symposium on Microarchitecture
Code size minimization and retargetable assembly for custom EPIC and VLIW instruction formats
ACM Transactions on Design Automation of Electronic Systems (TODAES)
Profile-guided code compression
PLDI '02 Proceedings of the ACM SIGPLAN 2002 Conference on Programming language design and implementation
Code compression for VLIW processors using variable-to-fixed coding
Proceedings of the 15th international symposium on System Synthesis
A Fast Asynchronous Huffman Decoder for Compressed-Code Embedded Processors
ASYNC '98 Proceedings of the 4th International Symposium on Advanced Research in Asynchronous Circuits and Systems
Code density optimization for embedded DSP processors using data compression techniques
ARVLSI '95 Proceedings of the 16th Conference on Advanced Research in VLSI (ARVLSI'95)
A Simple and Fast Scheme for Code Compression for VLIW Processors
DCC '03 Proceedings of the Conference on Data Compression
Code Compression Using Variable-to-fixed Coding Based on Arithmetic Coding
DCC '03 Proceedings of the Conference on Data Compression
Cold code decompression at runtime
Communications of the ACM - Program compaction
Compact Binaries with Code Compression in a Software Dynamic Translator
Proceedings of the conference on Design, automation and test in Europe - Volume 2
LZW-Based Code Compression for VLIW Embedded Systems
Proceedings of the conference on Design, automation and test in Europe - Volume 3
IEEE Transactions on Computers
Multi-profile based code compression
Proceedings of the 41st annual Design Automation Conference
Profile-Driven Selective Code Compression
DATE '03 Proceedings of the conference on Design, Automation and Test in Europe - Volume 1
A hamming distance based VLIW/EPIC code compression technique
Proceedings of the 2004 international conference on Compilers, architecture, and synthesis for embedded systems
Access Pattern-Based Code Compression for Memory-Constrained Embedded Systems
Proceedings of the conference on Design, Automation and Test in Europe - Volume 2
A post-compilation register reassignment technique for improving hamming distance code compression
Proceedings of the 2005 international conference on Compilers, architectures and synthesis for embedded systems
SAMC: a code compression algorithm for embedded processors
IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems
Hi-index | 0.00 |
We propose a new class of methods for VLIW code compression using variable-sized branch blocks with self-generating tables. Code compression traditionally works on fixed-sized blocks with its efficiency limited by their small size. A branch block, a series of instructions between two consecutive possible branch targets, provides larger blocks for code compression. We compare three methods for compressing branch blocks: table-based, Lempel-Ziv-Welch (LZW)-based and selective code compression. Our approaches are fully adaptive and generate the coding table on-the-fly during compression and decompression. When encountering a branch target, the coding table is cleared to ensure correctness. Decompression requires a simple table lookup and updates the coding table when necessary. When decoding sequentially, the table-based method produces 4 bytes per iteration while the LZW-based methods provide 8 bytes peak and 1.82 bytes average decompression bandwidth. Compared to Huffman's 1 byte and variable-to-fixed (V2F)'s 13-bit peak performance, our methods have higher decoding bandwidth and a comparable compression ratio. Parallel decompression could also be applied to our methods, which is more suitable for VLIW architectures.