Circuit and microarchitectural techniques for reducing cache leakage power

Authors:
Nam Sung Kim;Krisztian Flautner;David Blaauw;Trevor Mudge
Affiliations:
Department of Electrical Engineering and Computer Science, University of Michigan, Ann Arbor, MI;ARM, Ltd., Cambridge CB1 9NJ, U.K.;Department of Electrical Engineering and Computer Science, University of Michigan, Ann Arbor, MI;Department of Electrical Engineering and Computer Science, University of Michigan, Ann Arbor, MI
Venue:
IEEE Transactions on Very Large Scale Integration (VLSI) Systems
Year:
2004


Abstract

On-chip caches represent a sizable fraction of the total power consumption of microprocessors. As feature sizes shrink, the dominant component of this power consumption will be leakage. However, during a fixed period of time, the activity in a data cache is centered on only a small subset of the lines. This behavior can be exploited to cut the leakage power of large data caches by putting the cold cache lines into a state-preserving, low-power drowsy mode. In this paper, we investigate policies and circuit techniques for implementing drowsy data caches. We show that with simple microarchitectural techniques, about 80%-90% of the data cache lines can be maintained in a drowsy state without affecting performance by more than 0.6%, even though moving lines into and out of a drowsy state incurs a slight performance loss. According to our projections, in a 70-nm complementary metal-oxide-semiconductor process, drowsy data caches will be able to reduce the total leakage energy consumed in the caches by 60%-75%. In addition, we extend the drowsy cache concept to reduce leakage power of instruction caches without significant impact on execution time. Our results show that data and instruction caches require different control strategies for efficient execution. In order to enable drowsy instruction caches, we propose a technique called cache subbank prediction, which is used to selectively wake up only the necessary parts of the instruction cache, while allowing most of the cache to stay in a low-leakage drowsy mode. This prediction technique reduces the negative performance impact by 78% compared with the no-prediction policy. Our technique works well even with small predictor sizes and enables a 75% reduction of leakage energy in a 32-kB instruction cache.
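The trade-off described in the abstract (most lines kept drowsy, a small wake-up cost on access) can be illustrated with a toy model. The sketch below is not the paper's policy or simulator. It assumes a hypothetical scheme in which every line is put into drowsy mode once per fixed window of accesses, and touching a drowsy line costs a fixed wake-up penalty. The class name `DrowsyCache`, the window length, the one-cycle penalty, and the synthetic access pattern are all illustrative assumptions.

```python
from dataclasses import dataclass, field


@dataclass
class DrowsyCache:
    """Toy model of a drowsy cache (assumed policy, not taken from the paper)."""
    num_lines: int
    window: int              # accesses between sweeps that make every line drowsy (assumed)
    wake_penalty: int = 1    # extra cycles to wake a drowsy line (assumed)
    awake: set = field(default_factory=set)   # all lines start drowsy
    accesses: int = 0
    extra_cycles: int = 0
    drowsy_samples: int = 0  # running sum of drowsy-line counts, sampled per access

    def access(self, line: int) -> None:
        # At each window boundary, put every line back into drowsy mode.
        if self.accesses and self.accesses % self.window == 0:
            self.awake.clear()
        # A drowsy line must be woken before it can be read or written.
        if line not in self.awake:
            self.extra_cycles += self.wake_penalty
            self.awake.add(line)
        self.accesses += 1
        self.drowsy_samples += self.num_lines - len(self.awake)

    def drowsy_fraction(self) -> float:
        # Average fraction of lines that were drowsy over the run.
        return self.drowsy_samples / (self.accesses * self.num_lines)


# Synthetic workload: a hot working set of 16 lines in a 1024-line cache.
cache = DrowsyCache(num_lines=1024, window=2000)
for i in range(10_000):
    cache.access(i % 16)
```

With a small hot working set, `cache.drowsy_fraction()` stays close to 1 while `cache.extra_cycles` grows only when a window boundary forces the hot lines to be woken again, which is the qualitative behavior the abstract relies on. Real workloads, window lengths, and penalties would have to come from measurement or from the paper itself.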