A portable global optimizer and linker
PLDI '88 Proceedings of the ACM SIGPLAN 1988 conference on Programming Language design and Implementation
The SPARC architecture manual (version 9)
The SPARC architecture manual (version 9)
The filter cache: an energy efficient memory structure
MICRO 30 Proceedings of the 30th annual ACM/IEEE international symposium on Microarchitecture
Improving code density using compression techniques
MICRO 30 Proceedings of the 30th annual ACM/IEEE international symposium on Microarchitecture
A 160-MHz, 32-b, 0.5-W CMOS RISC microprocessor
Digital Technical Journal
Computer organization and design (2nd ed.): the hardware/software interface
Computer organization and design (2nd ed.): the hardware/software interface
Code compression based on operand factorization
MICRO 31 Proceedings of the 31st annual ACM/IEEE international symposium on Microarchitecture
Enhanced code compression for embedded RISC processors
Proceedings of the ACM SIGPLAN 1999 conference on Programming language design and implementation
Compiler techniques for code compaction
ACM Transactions on Programming Languages and Systems (TOPLAS)
Heads and tails: a variable-length instruction format supporting parallel fetch and decode
CASES '01 Proceedings of the 2001 international conference on Compilers, architecture, and synthesis for embedded systems
Analyzing and compressing assembly code
SIGPLAN '84 Proceedings of the 1984 SIGPLAN symposium on Compiler construction
DSP Processors Hit the Mainstream
Computer
A DISE implementation of dynamic code decompression
Proceedings of the 2003 ACM SIGPLAN conference on Language, compiler, and tool for embedded systems
Enhancing the performance of 16-bit code using augmenting instructions
Proceedings of the 2003 ACM SIGPLAN conference on Language, compiler, and tool for embedded systems
Tiny instruction caches for low power embedded systems
ACM Transactions on Embedded Computing Systems (TECS)
Reducing code size with echo instructions
Proceedings of the 2003 international conference on Compilers, architecture and synthesis for embedded systems
Dataflow Mini-Graphs: Amplifying Superscalar Capacity and Bandwidth
Proceedings of the 37th annual IEEE/ACM International Symposium on Microarchitecture
MiBench: A free, commercially representative embedded benchmark suite
WWC '01 Proceedings of the Workload Characterization, 2001. WWC-4. 2001 IEEE International Workshop
Reducing Instruction Fetch Cost by Packing Instructions into RegisterWindows
Proceedings of the 38th annual IEEE/ACM International Symposium on Microarchitecture
High-quality ISA synthesis for low-power cache designs in embedded microprocessors
IBM Journal of Research and Development
Adapting compilation techniques to enhance the packing of instructions into registers
CASES '06 Proceedings of the 2006 international conference on Compilers, architecture and synthesis for embedded systems
Addressing instruction fetch bottlenecks by using an instruction register file
Proceedings of the 2007 ACM SIGPLAN/SIGBED conference on Languages, compilers, and tools for embedded systems
Proceedings of the conference on Design, automation and test in Europe
Energy-efficient register caching with compiler assistance
ACM Transactions on Architecture and Code Optimization (TACO)
Proceedings of the 2010 Workshop on Interaction between Compilers and Computer Architecture
Implementing dynamic implied addressing mode for multi-output instructions
CASES '10 Proceedings of the 2010 international conference on Compilers, architectures and synthesis for embedded systems
Fine-grain dynamic instruction placement for L0 scratch-pad memory
CASES '10 Proceedings of the 2010 international conference on Compilers, architectures and synthesis for embedded systems
CATCH: A mechanism for dynamically detecting cache-content-duplication in instruction caches
ACM Transactions on Architecture and Code Optimization (TACO)
An exploration of mechanisms for dynamic cryptographic instruction set extension
CHES'11 Proceedings of the 13th international conference on Cryptographic hardware and embedded systems
Reducing instruction bit-width for low-power VLIW architectures
ACM Transactions on Design Automation of Electronic Systems (TODAES)
Hi-index | 0.00 |
New processors, both embedded and general purpose, often have conflicting design requirements involving space, power, and performance. Architectural features and compiler optimizations often target one or more design goals at the expense of the others. This paper presents a novel architectural and compiler approach to simultaneously reduce power requirements, decrease code size, and improve performance by integrating an instruction register file (IRF) into the architecture. Frequently occurring instructions are placed in the IRF. Multiple entries in the IRF can be referenced by a single packed instruction in ROM or L1 instruction cache. Unlike conventional code compression, our approach allows the frequent instructions to be referenced in arbitrary combinations. The experimental results show significant improvements in space and power, as well as some improvement in execution time when using only 32 entries. These advantages make packing instructions into registers an effective approach for improving overall efficiency.