Enabling large decoded instruction loop caching for energy-aware embedded processors

Authors: Ji Gu; Hui Guo
Affiliation: The University of New South Wales, Sydney, Australia (both authors)
Venue: CASES '10: Proceedings of the 2010 International Conference on Compilers, Architectures and Synthesis for Embedded Systems
Year: 2010

Abstract

Low energy consumption in embedded processors is becoming increasingly important as system complexity grows. The on-chip instruction cache (I-cache) is usually among the most energy-consuming components on the processor chip because of its large size and frequent accesses. To reduce this energy consumption, existing loop cache approaches use a tiny decoded cache to filter I-cache accesses and instruction decoding for repeated loop iterations. However, such designs are effective only for small, simple loops and suit only DSP kernel-like applications; they are ineffective for the many embedded applications in which complex loops are common. In this paper, we propose a decoded loop instruction cache (DLIC) that is small, and hence energy efficient, yet can capture most loops, including large, nested ones containing branches, so that a significant share of I-cache accesses and instruction decoding can be eliminated. Experiments on a set of embedded benchmarks show that the proposed DLIC scheme reduces energy consumption by up to 87%. On average, it saves 66% of the energy spent on instruction fetching and decoding, at a performance overhead of only 1.4%.
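
To make the filtering idea concrete, here is a minimal sketch of a generic, simple loop cache of the kind the abstract contrasts DLIC against. It is not the DLIC design, whose internals the abstract does not describe. The function name `simulate_loop_cache`, the trace format (a list of instruction indices in execution order), and the rule that a backward jump marks a loop are all assumptions made for illustration.

```python
def simulate_loop_cache(trace, capacity):
    """Toy model: count fetches that go to the I-cache versus fetches
    served from a small decoded loop buffer.

    trace    -- instruction indices in execution order (hypothetical format)
    capacity -- number of decoded instructions the buffer can hold
    """
    icache_accesses = 0   # fetches that still reach the I-cache and decoder
    buffer_hits = 0       # fetches served from the decoded buffer
    lo = hi = None        # index range of the loop currently tracked
    filled = set()        # indices whose decoded form is in the buffer
    prev = None

    for pc in trace:
        if prev is not None and pc <= prev:
            # Backward jump: assume a loop with body [pc, prev].
            if (lo, hi) != (pc, prev):
                if prev - pc + 1 <= capacity:
                    lo, hi, filled = pc, prev, set()   # track the new loop
                else:
                    lo, hi, filled = None, None, set() # loop too large
        elif lo is not None and not (lo <= pc <= hi):
            # Execution left the tracked loop: invalidate the buffer.
            lo, hi, filled = None, None, set()

        if lo is not None and pc in filled:
            buffer_hits += 1
        else:
            icache_accesses += 1
            if lo is not None:
                filled.add(pc)  # fill the buffer while the loop executes
        prev = pc

    return icache_accesses, buffer_hits


# A 3-instruction loop (indices 1..3) executed 10 times.
trace = [0, 1, 2, 3] + [1, 2, 3] * 9 + [4]
print(simulate_loop_cache(trace, capacity=4))  # (8, 24)
print(simulate_loop_cache(trace, capacity=2))  # (32, 0)
```

In this toy model the buffer tracks only one loop at a time, so an inner loop displaces its enclosing loop and a loop larger than `capacity` is never captured. This loosely mirrors the limitation of simple loop caches that the abstract points out; the counts above describe the toy trace only and say nothing about the paper's measured results.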