According to the International Technology Roadmap for Semiconductors (ITRS), the minimum feature size for microprocessors will shrink to 40 nm by 2010. Leakage currents in devices fabricated at these dimensions have been shown to be so dominant that power-budget-driven design methodologies will have to reduce static power in addition to active power. An effective way to tackle static power is to transition devices to a low-static-power sleep mode using special circuit-level techniques. However, these transitions have an energy cost, and as the techniques are perfected and devices enter the sleep state more often, the relative contribution of transition energy to total energy will increase. Commonly used techniques for dealing with this transition overhead are history-based and concentrate only on recognizing when to transition; they do not reduce the total number of transitions without adversely affecting the total sleep time of the devices. In this paper, we study transition-overhead reduction in associative instruction caches. We take advantage of the fact that many programs, particularly multimedia applications, spend most of their time in loops and that most execution is near-sequential (high spatial locality). We present a technique called DRU (Distance-based Recent Use), which constrains near-sequential fetches to a single bank from the set of associative banks. Evaluation of DRU for different replacement policies in a system-level environment, using Mediabench applications and various processor architectures (including SPARC and MIPS), has shown energy savings of 20% to 28% with negligible hardware and timing overheads.
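
The intuition behind keeping near-sequential fetches in one bank can be illustrated with a toy model. The sketch below is not the DRU algorithm from the paper, whose details the abstract does not give. It assumes that only the most recently accessed bank is awake, that every change of bank costs one transition, and that cache capacity and replacement can be ignored. The function names, the round-robin baseline, and the `max_distance` threshold are all hypothetical.

```python
def count_transitions(banks):
    """Count how often the active bank changes between consecutive fetches."""
    return sum(1 for prev, cur in zip(banks, banks[1:]) if prev != cur)


def round_robin_banks(lines, n_banks):
    """Assumed baseline: each newly seen line goes to the next bank in turn."""
    placement = {}
    for line in lines:
        if line not in placement:
            placement[line] = len(placement) % n_banks
    return [placement[line] for line in lines]


def distance_based_banks(lines, n_banks, max_distance=1):
    """Toy distance rule (an assumption, not the paper's policy): a new line
    within max_distance of the previously fetched line joins that line's bank;
    any other new line is placed in the next bank in turn."""
    placement = {}
    next_bank = 0
    prev = None
    for line in lines:
        if line not in placement:
            if prev is not None and abs(line - prev) <= max_distance:
                placement[line] = placement[prev]
            else:
                placement[line] = next_bank
                next_bank = (next_bank + 1) % n_banks
        prev = line
    return [placement[line] for line in lines]


# Hypothetical trace of cache-line indices: a four-line loop executed
# three times, followed by a jump to a distant two-line region.
trace = [100, 101, 102, 103] * 3 + [500, 501]
baseline = count_transitions(round_robin_banks(trace, n_banks=4))
toy_dru = count_transitions(distance_based_banks(trace, n_banks=4))
```

On this trace the round-robin baseline changes banks on all 13 consecutive fetch pairs, while the distance rule changes banks once, at the jump out of the loop. Real savings depend on transition energy, sleep-mode leakage, and the replacement policy, none of which this model captures.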