Tiny instruction caches for low power embedded systems

Authors:
Ann Gordon-Ross; Susan Cotterell; Frank Vahid
Affiliations:
University of California, Riverside, CA; University of California, Riverside, CA; University of California, Riverside, CA
Venue:
ACM Transactions on Embedded Computing Systems (TECS)
Year:
2003

Abstract

Instruction caches have traditionally been used to improve software performance. Recently, several tiny instruction cache designs, including filter caches and dynamic loop caches, have been proposed to reduce software power instead. We propose several new tiny instruction cache designs, including preloaded loop caches and one-level and two-level hybrid dynamic/preloaded loop caches. We evaluate the existing and proposed designs on embedded system software benchmarks from both the Powerstone and MediaBench suites, on two different processor architectures, for a variety of technologies. We show that, on average, filter caching achieves the best instruction fetch energy reductions, 60% to 80%, but at the cost of about 20% performance degradation, which could in turn affect overall energy savings. We show that dynamic loop caching gives good instruction fetch energy savings of about 30%, but that if a designer is able to profile a program, preloaded loop caching can more than double those savings. We describe automated methods for quickly determining the best loop cache configuration, which are useful in a core-based design flow.
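
The remark that a performance degradation could affect overall energy savings can be illustrated with a rough back-of-the-envelope model. The sketch below is not taken from the paper. It assumes, purely for illustration, that baseline energy splits into an instruction fetch share, a share that grows in proportion to run time (for example static power), and a remainder that is unchanged. The function name, its parameters, and the 30% and 20% shares in the example call are hypothetical; only the fetch reduction and slowdown figures echo the abstract.

```python
def overall_energy_saving(fetch_fraction, time_fraction, fetch_reduction, slowdown):
    """Return the fraction of baseline system energy saved.

    Hypothetical model (an assumption, not the paper's method):
      fetch_fraction  -- share of baseline energy spent on instruction fetch
      time_fraction   -- share of baseline energy that scales with run time
      fetch_reduction -- fractional cut in fetch energy from the tiny cache
      slowdown        -- fractional increase in run time
    """
    if fetch_fraction < 0 or time_fraction < 0:
        raise ValueError("energy shares must be non-negative")
    if fetch_fraction + time_fraction > 1:
        raise ValueError("energy shares must sum to at most 1")

    # Fetch energy shrinks, time-dependent energy grows, the rest is unchanged.
    new_fetch = fetch_fraction * (1 - fetch_reduction)
    new_time = time_fraction * (1 + slowdown)
    unchanged = 1 - fetch_fraction - time_fraction
    return 1 - (new_fetch + new_time + unchanged)


if __name__ == "__main__":
    # Assumed shares: 30% fetch, 20% time-dependent. The 70% fetch reduction
    # and 20% slowdown fall within the ranges quoted in the abstract.
    saving = overall_energy_saving(0.30, 0.20, 0.70, 0.20)
    print(f"overall saving: {saving:.2f}")
```

Under this model the overall saving equals fetch_fraction times fetch_reduction minus time_fraction times slowdown, so a large fetch energy reduction can be partly offset when a sizable share of system energy scales with run time.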