As transistors keep shrinking and on-chip caches keep growing, static power dissipation caused by cache leakage accounts for an increasing fraction of total processor power. Several techniques have been proposed to reduce leakage power by turning off unused cache lines, but all of them pay for the savings with degraded performance. This paper presents a cache architecture, the snug set-associative (SSA) cache, that eliminates most of the static power dissipation of caches without incurring performance penalties. The SSA cache reduces leakage power by implementing the minimum set-associative scheme, which activates only the minimal number of ways in each cache set, while the performance losses caused by this scheme are compensated by the base-offset load/store queues. The rationale for combining these two techniques is locality: because the contents of the cache blocks in the current working set are accessed repeatedly, the same addresses are computed again and again. The SSA cache architecture can be applied to both data and instruction caches. Experimental results show that SSA cuts static power consumption of the L1 data cache by 93%, on average, for SPECint2000 benchmarks, while execution times are reduced by 5%. Similarly, SSA cuts leakage dissipation of the L1 instruction cache by 92%, on average, and improves performance by more than 3%. Furthermore, when SSA is adopted for both the L1 data and instruction caches, their normalized leakage is lowered to 8%, on average, while still achieving a 2% reduction in execution times.
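To make the idea of activating only the ways a set needs more concrete, the following is a minimal Python sketch. It is not the SSA design from the paper: the abstract does not give the activation policy, so the sketch assumes that every way starts powered off and is switched on only when a set first needs room for a new block. The class name `MinWaysCache`, its parameters, LRU replacement, and the use of the fraction of powered-on ways as a rough leakage proxy are all illustrative assumptions. The base-offset load/store queues are not modeled.

```python
from collections import OrderedDict


class MinWaysCache:
    """Toy set-associative cache that powers on ways only on demand.

    Hypothetical illustration; not the SSA cache described in the paper.
    """

    def __init__(self, num_sets=4, max_ways=4, block_bytes=32):
        self.num_sets = num_sets
        self.max_ways = max_ways
        self.block_bytes = block_bytes
        # One LRU-ordered dict of resident tags per set. Its length is the
        # number of ways currently powered on in that set (assumption).
        self.sets = [OrderedDict() for _ in range(num_sets)]
        self.hits = 0
        self.misses = 0

    def access(self, addr):
        """Look up a byte address; return True on a hit, False on a miss."""
        block = addr // self.block_bytes
        index = block % self.num_sets
        tag = block // self.num_sets
        ways = self.sets[index]
        if tag in ways:
            ways.move_to_end(tag)  # mark as most recently used
            self.hits += 1
            return True
        self.misses += 1
        if len(ways) == self.max_ways:
            ways.popitem(last=False)  # set is full: evict the LRU block
        ways[tag] = None  # fills a free way, powering it on if needed
        return False

    def active_way_fraction(self):
        """Fraction of all ways that are powered on (crude leakage proxy)."""
        active = sum(len(ways) for ways in self.sets)
        return active / (self.num_sets * self.max_ways)


if __name__ == "__main__":
    cache = MinWaysCache(num_sets=4, max_ways=4, block_bytes=32)
    for addr in (0, 32, 0, 128):
        cache.access(addr)
    print(cache.hits, cache.misses, cache.active_way_fraction())
```

In this toy model a small working set leaves most ways off, which is the intuition behind the leakage savings; a real design would also need a policy for powering ways back down, which the sketch omits.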