Less reused filter: improving l2 cache performance via filtering less reused lines

  • Authors:
  • Lingxiang Xiang;Tianzhou Chen;Qingsong Shi;Wei Hu

  • Affiliations:
  • Zhejiang University, Hangzhou, China;Zhejiang University, Hangzhou, China;Zhejiang University, Hangzhou, China;Zhejiang University, Hangzhou, China

  • Venue:
  • Proceedings of the 23rd international conference on Supercomputing
  • Year:
  • 2009

Quantified Score

Hi-index 0.00

Visualization

Abstract

The L2 cache is commonly managed using LRU policy. For workloads that have a working set larger than L2 cache, LRU behaves poorly, resulting in a great number of less reused lines that are never reused or reused for few times. In this case, the cache performance can be improved through retaining a portion of working set in cache for a period long enough. Previous schemes approach this by bypassing never reused lines. Nevertheless, severely constrained by the number of never reused lines, sometimes they deliver no benefit due to the lack of never reused lines. This paper proposes a new filtering mechanism that filters out the less reused lines rather than just never reused lines. The extended scope of bypassing provides more opportunities to fit the working set into cache. This paper also proposes a Less Reused Filter (LRF), a separate structure that precedes L2 cache, to implement the above mechanism. LRF employs a reuse frequency predictor to accurately identify the less reused lines from incoming lines. Meanwhile, based on our observation that most less reused lines have a short life span, LRF places the filtered lines into a small filter buffer to fully utilize them, avoiding extra misses. Our evaluation, for 24 SPEC 2000 benchmarks, shows that augmenting a 512KB LRU-managed L2 cache with a LRF having 32KB filter buffer reduces the average MPKI by 27.5%, narrowing the gap between LRU and OPT by 74.4%.