This paper describes the architecture and implementation of a high-speed decompression engine for embedded processors. The engine targets processors whose programs are stored in compressed form and decompressed at runtime during instruction cache refill. It uses a unique asynchronous, variable-decompression-rate architecture to process Huffman-encoded instructions. The resulting circuit is significantly smaller than comparable synchronous decoders, yet has a higher throughput than almost all existing designs. The 0.8 micron layout is full-custom and consists predominantly of dynamic domino logic. The top-level control and several small state machines are implemented in asynchronous logic, and the design operates without a user-supplied clock. Simulations using Lsim show an average throughput of 32 bits/45 ns on the output side, corresponding to about 480 Mbit/sec on the input side. The chip has been manufactured by MOSIS; tests show that the asynchronous implementation operates correctly, with an average throughput exceeding the simulated figure: 32 bits/39 ns on the output side, corresponding to about 560 Mbit/sec on the input side. This speed is acceptable for our application. The area of the design (excluding the pad-frame overhead) is only 0.75 mm². The design is the first fabricated chip for an instruction decompression unit for embedded processors.
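The throughput figures quoted above mix two units (bits per interval on the output side, Mbit/sec on the input side). The short Python sketch below converts the output-side figures to Mbit/sec and derives the compressed-to-decompressed ratio they imply. The function and variable names are illustrative only, and the derived ratio is an inference from the quoted numbers, assuming both sides are averaged over the same workload; the abstract itself states no compression ratio.

```python
def mbit_per_sec(bits: int, interval_ns: float) -> float:
    """Convert 'bits per interval in ns' to Mbit/sec.

    bits/ns equals Gbit/sec, so multiply by 1e3 to get Mbit/sec.
    """
    return bits / interval_ns * 1e3


# Output-side (decompressed) throughput quoted in the abstract.
sim_out = mbit_per_sec(32, 45)    # Lsim simulation
chip_out = mbit_per_sec(32, 39)   # measured on the fabricated chip

# Input-side (compressed) throughput quoted in the abstract, in Mbit/sec.
sim_in = 480.0
chip_in = 560.0

# Implied ratio of compressed bits consumed to decompressed bits produced.
# This is an inference, not a figure given in the abstract.
sim_ratio = sim_in / sim_out
chip_ratio = chip_in / chip_out

if __name__ == "__main__":
    print(f"simulated output: {sim_out:.0f} Mbit/sec, implied ratio {sim_ratio:.3f}")
    print(f"measured output:  {chip_out:.0f} Mbit/sec, implied ratio {chip_ratio:.3f}")
```

Under these assumptions the output side runs at roughly 711 Mbit/sec in simulation and 821 Mbit/sec on silicon, and both cases imply that the compressed stream is a little over two thirds the size of the decompressed one. The input-side figures are quoted as approximate, so the two ratios need not agree exactly.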