Improving Program Efficiency by Packing Instructions into Registers

Authors:
Stephen Hines; Joshua Green; Gary Tyson; David Whalley
Affiliations:
Florida State University; Florida State University; Florida State University; Florida State University
Venue:
Proceedings of the 32nd Annual International Symposium on Computer Architecture
Year:
2005

Abstract

New processors, both embedded and general purpose, often face conflicting design requirements involving space, power, and performance, and architectural features and compiler optimizations typically advance one or more of these goals at the expense of the others. This paper presents a novel architectural and compiler approach that simultaneously reduces power requirements, decreases code size, and improves performance by integrating an instruction register file (IRF) into the architecture. Frequently occurring instructions are placed in the IRF, and a single packed instruction in ROM or the L1 instruction cache can reference multiple IRF entries. Unlike conventional code compression, our approach allows the frequent instructions to be referenced in arbitrary combinations. Experimental results show significant improvements in space and power, as well as some improvement in execution time, with an IRF of only 32 entries. These advantages make packing instructions into registers an effective approach for improving overall efficiency.
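
The packing idea described in the abstract can be illustrated with a small software model. The following is a minimal sketch, not the authors' implementation: the function names, the greedy packing strategy, and the limit of five references per packed instruction are all assumptions made for illustration (five 5-bit indices into a 32-entry IRF occupy 25 bits of a 32-bit word). Only the 32-entry IRF size comes from the abstract.

```python
from collections import Counter

IRF_SIZE = 32  # entry count mentioned in the abstract
MAX_REFS = 5   # assumption: five 5-bit IRF indices per packed word


def build_irf(instructions, size=IRF_SIZE):
    """Pick the most frequent instructions; list position = IRF entry number."""
    return [ins for ins, _ in Counter(instructions).most_common(size)]


def pack(instructions, irf, max_refs=MAX_REFS):
    """Greedily replace runs of IRF-resident instructions with packed references."""
    index = {ins: i for i, ins in enumerate(irf)}
    out, run = [], []

    def flush():
        # Emit the pending run of IRF indices as one packed instruction.
        if run:
            out.append(("packed", tuple(run)))
            run.clear()

    for ins in instructions:
        if ins in index:
            run.append(index[ins])
            if len(run) == max_refs:
                flush()
        else:
            flush()
            out.append(("plain", ins))
    flush()
    return out


def unpack(packed, irf):
    """Expand packed references back into the original instruction stream."""
    result = []
    for kind, payload in packed:
        if kind == "packed":
            result.extend(irf[i] for i in payload)
        else:
            result.append(payload)
    return result
```

Because a packed instruction stores plain IRF indices, any combination and ordering of the frequent instructions can be expressed, which is the property the abstract contrasts with conventional code compression. Calling `unpack(pack(program, irf), irf)` on a list of instruction strings returns the original list, so the model is lossless under these assumptions.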