Executing compressed programs on an embedded RISC architecture
MICRO 25 Proceedings of the 25th annual international symposium on Microarchitecture
Evaluation of a high performance code compression method
Proceedings of the 32nd annual ACM/IEEE international symposium on Microarchitecture
Design of an one-cycle decompression hardware for performance increase in embedded systems
Proceedings of the 39th annual Design Automation Conference
Universal Compression and Retrieval
Universal Compression and Retrieval
A code decompression architecture for VLIW processors
Proceedings of the 34th annual ACM/IEEE international symposium on Microarchitecture
Computers and Intractability: A Guide to the Theory of NP-Completeness
Computers and Intractability: A Guide to the Theory of NP-Completeness
A Simple and Fast Scheme for Code Compression for VLIW Processors
DCC '03 Proceedings of the Conference on Data Compression
DISE: a programmable macro engine for customizing applications
Proceedings of the 30th annual international symposium on Computer architecture
Reducing code size with echo instructions
Proceedings of the 2003 international conference on Compilers, architecture and synthesis for embedded systems
A decompression core for powerPC
IBM Journal of Research and Development
A hamming distance based VLIW/EPIC code compression technique
Proceedings of the 2004 international conference on Compilers, architecture, and synthesis for embedded systems
A cycle-accurate compilation algorithm for custom pipelined datapaths
CODES+ISSS '05 Proceedings of the 3rd IEEE/ACM/IFIP international conference on Hardware/software codesign and system synthesis
Using minimal minterms to represent programmability
CODES+ISSS '05 Proceedings of the 3rd IEEE/ACM/IFIP international conference on Hardware/software codesign and system synthesis
ICCD '05 Proceedings of the 2005 International Conference on Computer Design
A Graph Based Algorithm for Data Path Optimization in Custom Processors
DSD '06 Proceedings of the 9th EUROMICRO Conference on Digital System Design
Generic netlist representation for system and PE level design exploration
CODES+ISSS '06 Proceedings of the 4th international conference on Hardware/software codesign and system synthesis
Proceedings of the conference on Design, automation and test in Europe
No-instruction-set-computer (nisc) technology modeling and compilation
No-instruction-set-computer (nisc) technology modeling and compilation
C-based design flow: a case study on G.729A for voice over internet protocol (VoIP)
Proceedings of the 45th annual Design Automation Conference
Automatic architecture refinement techniques for customizing processing elements
Proceedings of the 45th annual Design Automation Conference
Synthesis and optimization of low-power custom nisc processors
Synthesis and optimization of low-power custom nisc processors
A universal placement technique of compressed instructions for efficient parallel decompression
IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems
Bitmask aware compression of NISC control words
Integration, the VLSI Journal
Hi-index | 0.00 |
Horizontal Microcoded Architecture (HMA) is a paradigm for designing programmable high-performance processing elements (PEs). However, it suffers from large code size, which can be addressed by compression. In this article, we study the code size of one of the new HMA-based technologies called No-Instruction-Set Computer (NISC). We show that NISC code size can be several times larger than a typical RISC processor, and we propose several low-overhead dictionary-based code compression techniques to reduce its code size. Our compression algorithm leverages the knowledge of “don't care” values in the control words and can reduce the code size by 3.3 times, on average. Despite such good results, as shown in this article, these compression techniques lead to poor FPGA implementations because they require many on-chip RAMs. To address this issue, we introduce an FPGA-aware dictionary-based technique that uses the dual-port feature of on-chip RAMs to reduce the number of utilized block RAMs by half. Additionally, we propose cascading two-levels of dictionaries for code size and block RAM reduction of large programs. For an MP3 application, a merged, cascaded, three-dictionary implementation reduces the number of utilized block RAMs by 4.3 times (76%) compared to a NISC without compression. This corresponds to 20% additional savings over the best single level dictionary-based compression.