Snug set-associative caches: Reducing leakage power of instruction and data caches with no performance penalties

  • Authors:
  • Yuan-Shin Hwang;Jia-Jhe Li

  • Affiliations:
  • National Taiwan Ocean University, Keelung, Taiwan;National Tsing Hua University, Hsinchu, Taiwan

  • Venue:
  • ACM Transactions on Architecture and Code Optimization (TACO)
  • Year:
  • 2007

Abstract

As transistors keep shrinking and on-chip caches keep growing, static power dissipation resulting from cache leakage takes an increasing fraction of total processor power. Several techniques have already been proposed to reduce leakage power by turning off unused cache lines. However, they all pay the price of performance degradation. This paper presents a cache architecture, the snug set-associative (SSA) cache, that cuts most of the static power dissipation of caches without incurring performance penalties. The SSA cache reduces leakage power by implementing the minimum set-associative scheme, which activates only the minimal number of ways in each cache set, while the performance losses caused by this scheme are compensated by the base-offset load/store queues. The rationale for combining these two techniques is locality: as the contents of the cache blocks in the current working set are repeatedly accessed, the same addresses are computed again and again. The SSA cache architecture can be applied to both data and instruction caches. Experimental results show that SSA can cut static power consumption of the L1 data cache by 93%, on average, for SPECint2000 benchmarks, while execution times are reduced by 5%. Similarly, SSA can cut leakage dissipation of the L1 instruction cache by 92%, on average, and improve performance by over 3%. Furthermore, when SSA is adopted for both L1 data and instruction caches, their normalized leakage is lowered to 8%, on average, while still achieving a 2% reduction in execution times.
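
The idea of powering on only some of the ways in each set can be illustrated with a toy model. The sketch below is not the SSA design from the paper: the class `ToyWayGatedCache`, the `max_active_ways` cap, and the LRU replacement are all hypothetical choices made for illustration, and it assumes leakage is simply proportional to the number of powered-on ways. It does not model the base-offset load/store queues.

```python
from collections import OrderedDict


class ToyWayGatedCache:
    """Toy set-associative cache that powers on ways only when they are needed.

    Illustrative assumption: a way leaks only while it holds a block, so the
    fraction of powered-on ways serves as a rough proxy for normalized leakage.
    """

    def __init__(self, num_sets, num_ways, max_active_ways, block_size=32):
        if not 1 <= max_active_ways <= num_ways:
            raise ValueError("max_active_ways must be between 1 and num_ways")
        self.num_sets = num_sets
        self.num_ways = num_ways
        self.max_active_ways = max_active_ways  # hypothetical per-set cap
        self.block_size = block_size
        # One LRU-ordered mapping of tags per set; its length is the number
        # of powered-on ways in that set.
        self.sets = [OrderedDict() for _ in range(num_sets)]

    def access(self, address):
        """Look up an address; return True on a hit, False on a miss."""
        block = address // self.block_size
        index = block % self.num_sets
        tag = block // self.num_sets
        lines = self.sets[index]
        if tag in lines:
            lines.move_to_end(tag)  # mark as most recently used
            return True
        if len(lines) >= self.max_active_ways:
            lines.popitem(last=False)  # evict the least recently used block
        lines[tag] = True  # fill a powered-on way (powering one on if needed)
        return False

    def active_fraction(self):
        """Fraction of all ways that are currently powered on."""
        active = sum(len(lines) for lines in self.sets)
        return active / (self.num_sets * self.num_ways)


if __name__ == "__main__":
    cache = ToyWayGatedCache(num_sets=4, num_ways=4, max_active_ways=1)
    for addr in (0, 32, 0, 128):
        print(hex(addr), "hit" if cache.access(addr) else "miss")
    print("active fraction:", cache.active_fraction())
```

Under these assumptions, a smaller cap lowers the leakage proxy but raises the conflict-miss rate, which is the performance loss the abstract says the base-offset load/store queues are meant to offset.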