Reducing leakage in a high-performance deep-submicron instruction cache

Authors:
Michael Powell;Se-Hyun Yang;Babak Falsafi;Kaushik Roy;T. N. Vijaykumar
Affiliations:
Carnegie Mellon Univ., Pittsburgh, PA;Carnegie Mellon Univ., Pittsburgh, PA;Carnegie Mellon Univ., Pittsburgh, PA;Carnegie Mellon Univ., Pittsburgh, PA;Carnegie Mellon Univ., Pittsburgh, PA
Venue:
IEEE Transactions on Very Large Scale Integration (VLSI) Systems - Special issue on low power electronics and design
Year:
2001

Citing 0
Cited 29

Design limitations in deep sub-0.1μm CMOS SRAM

Proceedings of the 12th ACM Great Lakes symposium on VLSI
Fine-grain CAM-tag cache resizing using miss tags

Proceedings of the 2002 international symposium on Low power electronics and design
Energy-aware design of embedded memories: A survey of technologies, architectures, and optimization techniques

ACM Transactions on Embedded Computing Systems (TECS)
Compiler-directed instruction cache leakage optimization

Proceedings of the 35th annual ACM/IEEE international symposium on Microarchitecture
A compiler approach for reducing data cache energy

ICS '03 Proceedings of the 17th annual international conference on Supercomputing
Design methodology for fine-grained leakage control in MTCMOS

Proceedings of the 2003 international symposium on Low power electronics and design
Exploiting program hotspots and code sequentiality for instruction cache leakage management

Proceedings of the 2003 international symposium on Low power electronics and design
Energy optimization techniques in cluster interconnects

Proceedings of the 2003 international symposium on Low power electronics and design
Dynamically Tuning Processor Resources with Adaptive Processing

Computer
Leakage Current: Moore's Law Meets Static Power

Computer
Reducing instruction cache energy consumption using a compiler-based strategy

ACM Transactions on Architecture and Code Optimization (TACO)
Soft error and energy consumption interactions: a data cache perspective

Proceedings of the 2004 international symposium on Low power electronics and design
Static next sub-bank prediction for drowsy instruction cache

Proceedings of the 2004 international conference on Compilers, architecture, and synthesis for embedded systems
Energy management in software-controlled multi-level memory hierarchies

GLSVLSI '05 Proceedings of the 15th ACM Great Lakes symposium on VLSI
A Holistic Approach to Designing Energy-Efficient Cluster Interconnects

IEEE Transactions on Computers
Reducing data cache leakage energy using a compiler-based approach

ACM Transactions on Embedded Computing Systems (TECS)
Exploring the limits of leakage power reduction in caches

ACM Transactions on Architecture and Code Optimization (TACO)
Power reduction techniques for microprocessor systems

ACM Computing Surveys (CSUR)
STV-Cache: a leakage energy-efficient architecture for data caches

GLSVLSI '06 Proceedings of the 16th ACM Great Lakes symposium on VLSI
Exploiting loop behavior for data cache leakage reduction

Journal of Embedded Computing - Cache exploitation in embedded systems
Reducing branch predictor leakage energy by exploiting loops

ACM Transactions on Embedded Computing Systems (TECS) - SPECIAL ISSUE SCOPES 2005
Reducing leakage in power-saving capable caches for embedded systems by using a filter cache

MEDEA '07 Proceedings of the 2007 workshop on MEmory performance: DEaling with Applications, systems and architecture
Compiler-guided next sub-bank prediction for reducing instruction cache leakage energy

Journal of Embedded Computing - Embeded Processors and Systems: Architectural Issues and Solutions for Emerging Applications
Multi-processor computer system having low power consumption

PACS'02 Proceedings of the 2nd international conference on Power-aware computer systems
Compiler-guided leakage optimization for banked scratch-pad memories

IEEE Transactions on Very Large Scale Integration (VLSI) Systems
Power-aware dynamic cache partitioning for CMPs

Transactions on high-performance embedded architectures and compilers III
Dynamic voltage scaling for power aware fast fourier transform (FFT) processor

ACSAC'05 Proceedings of the 10th Asia-Pacific conference on Advances in Computer Systems Architecture
Energy-optimal caches with guaranteed lifetime

Proceedings of the 2012 ACM/IEEE international symposium on Low power electronics and design
A survey on cache tuning from a power/energy perspective

ACM Computing Surveys (CSUR)

Abstract

Deep-submicron CMOS designs maintain high transistor switching speeds by scaling down the supply voltage and proportionately reducing the transistor threshold voltage. Lowering the threshold voltage increases leakage energy dissipation due to subthreshold leakage current even when the transistor is not switching. Estimates suggest a five-fold increase in leakage energy in every future generation. In modern microarchitectures, much of the leakage energy is dissipated in large on-chip cache memory structures with high transistor densities. While cache utilization varies both within and across applications, modern cache designs are fixed in size, resulting in transistor leakage inefficiencies. This paper explores an integrated architectural and circuit-level approach to reducing leakage energy in instruction caches (i-caches). At the architecture level, we propose the Dynamically ResIzable i-cache (DRI i-cache), a novel i-cache design that dynamically resizes and adapts to an application's required size. At the circuit level, we use gated-Vdd, a novel mechanism that effectively turns off the supply voltage to, and eliminates leakage in, the SRAM cells in a DRI i-cache's unused sections. Architectural and circuit-level simulation results indicate that a DRI i-cache successfully and robustly exploits the cache size variability both within and across applications. Compared to a conventional i-cache using an aggressively scaled threshold voltage, a 64K DRI i-cache reduces on average both the leakage energy-delay product and cache size by 62%, with less than 4% impact on execution time. Our results also indicate that a wide NMOS dual-Vt gated-Vdd transistor with a charge pump offers the best gating implementation and virtually eliminates leakage energy with minimal increase in SRAM cell read time and area as compared to an i-cache with an aggressively scaled threshold voltage.
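
The abstract states that the DRI i-cache resizes itself to an application's required size but does not spell out the resizing policy. The sketch below is one plausible illustration only, not the authors' algorithm: it assumes the cache counts misses over a fixed interval and halves or doubles its powered portion against a threshold. The class name, the miss_bound and min_size_kb parameters, and their values are all hypothetical; only the 64K full size comes from the abstract.

```python
from dataclasses import dataclass


@dataclass
class ResizableICache:
    """Toy model of an i-cache whose active size adapts to the miss count.

    All parameter names and default values are hypothetical.
    """
    max_size_kb: int = 64       # full cache size (64K, as in the abstract)
    min_size_kb: int = 1        # assumed lower bound on the active size
    miss_bound: int = 100       # assumed tolerated misses per interval
    size_kb: int = 64           # currently powered (active) size

    def end_of_interval(self, misses: int) -> int:
        """Resize by a factor of two based on misses seen in one interval."""
        if misses > self.miss_bound and self.size_kb < self.max_size_kb:
            self.size_kb *= 2   # too many misses: power more of the cache back on
        elif misses < self.miss_bound and self.size_kb > self.min_size_kb:
            self.size_kb //= 2  # few misses: gate off half of the active portion
        return self.size_kb

    def active_fraction(self) -> float:
        """Fraction of cells still powered; gated cells are assumed to leak nothing."""
        return self.size_kb / self.max_size_kb


cache = ResizableICache()
# Hypothetical miss counts for four consecutive intervals.
sizes = [cache.end_of_interval(m) for m in (10, 20, 500, 30)]
print(sizes)                    # active size after each interval
print(cache.active_fraction())  # share of the cache still leaking
```

Under the simplifying assumption that gated-off cells leak nothing, the active fraction gives a rough upper bound on the remaining cache leakage; it ignores the overhead of the gating transistors and any extra misses caused by downsizing.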