Reducing Instruction Fetch Cost by Packing Instructions into Register Windows

Authors:
Stephen Hines;Gary Tyson;David Whalley
Affiliations:
Florida State University;Florida State University;Florida State University
Venue:
Proceedings of the 38th annual IEEE/ACM International Symposium on Microarchitecture
Year:
2005

Abstract

Instruction packing is a combination compiler/architectural approach that allows for decreased code size, reduced power consumption and improved performance. The packing is obtained by placing frequently occurring instructions into an Instruction Register File (IRF). Multiple IRF entries can then be accessed using special packed instructions. Previous IRF efforts focused on using a single 32-entry register file for the duration of an application. This paper presents software and hardware extensions to the IRF supporting multiple instruction register windows to allow a greater number of relevant instructions to be available for packing in each function. Windows are shared among similar functions to reduce the overall costs involved in such an approach. The results indicate that significant improvements in instruction fetch cost can be obtained by using this simple architectural enhancement. We also show that using an IRF with a loop cache, which is also used to reduce energy consumption, results in much less energy consumption than using either feature in isolation.
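
The idea of per-function windows can be illustrated with a toy model. The sketch below is not the authors' implementation: the function names `build_window` and `fetch_count`, the greedy packing rule, the limit of five IRF references per packed instruction, and the two example instruction streams are all assumptions made for illustration. It also ignores window-switching cost, instruction encodings, and the sharing of windows among similar functions. It only compares how many fetches a stream needs when one small window serves the whole program versus when each function gets its own window.

```python
from collections import Counter

def build_window(instructions, size=32):
    """Pick the `size` most frequent instructions as one (hypothetical) IRF window."""
    return {insn for insn, _ in Counter(instructions).most_common(size)}

def fetch_count(instructions, window, max_pack=5):
    """Count fetches when runs of window-resident instructions are packed greedily.

    Assumption: one packed instruction can reference up to `max_pack` IRF
    entries; any instruction not in the window costs one ordinary fetch.
    """
    fetches = 0
    run = 0  # length of the current run of window-resident instructions
    for insn in instructions:
        if insn in window:
            run += 1
            if run == max_pack:   # the packed instruction is full
                fetches += 1
                run = 0
        else:
            if run:               # close the partially filled packed instruction
                fetches += 1
                run = 0
            fetches += 1          # ordinary fetch of the non-resident instruction
    if run:
        fetches += 1
    return fetches

# Two made-up functions with different instruction mixes.
f = ["add", "add", "ld", "add", "ld", "st"]
g = ["mul", "sub", "mul", "sub", "mul", "br"]

# One 2-entry window shared by the whole program.
shared_window = build_window(f + g, size=2)
shared = fetch_count(f, shared_window) + fetch_count(g, shared_window)

# One 2-entry window per function.
windowed = (fetch_count(f, build_window(f, size=2))
            + fetch_count(g, build_window(g, size=2)))

print("unpacked:", len(f) + len(g))
print("single window:", shared)
print("per-function windows:", windowed)
```

In this toy setting the per-function windows need fewer fetches because each function's frequent instructions all fit in its own window, which is the effect the abstract attributes to register windows. The numbers say nothing about the paper's measured results.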