The LRU-K page replacement algorithm for database disk buffering
SIGMOD '93 Proceedings of the 1993 ACM SIGMOD international conference on Management of data
Next cache line and set prediction
ISCA '95 Proceedings of the 22nd annual international symposium on Computer architecture
Run-time adaptive cache hierarchy management via reference analysis
Proceedings of the 24th annual international symposium on Computer architecture
Exploiting spatial locality in data caches using spatial footprints
Proceedings of the 25th annual international symposium on Computer architecture
Way-predicting set-associative cache for high performance and low energy consumption
ISLPED '99 Proceedings of the 1999 international symposium on Low power electronics and design
Active Management of Data Caches by Exploiting Reuse Information
IEEE Transactions on Computers
ISCA '90 Proceedings of the 17th annual international symposium on Computer Architecture
IEEE Transactions on Computers
A fully associative software-managed cache design
Proceedings of the 27th annual international symposium on Computer architecture
Dead-block prediction & dead-block correlating prefetchers
ISCA '01 Proceedings of the 28th annual international symposium on Computer architecture
Timekeeping in the memory system: predicting and optimizing memory behavior
ISCA '02 Proceedings of the 29th annual international symposium on Computer architecture
IEEE Transactions on Computers
Compiler managed micro-cache bypassing for high performance EPIC processors
Proceedings of the 35th annual ACM/IEEE international symposium on Microarchitecture
The EELRU adaptive replacement algorithm
Performance Evaluation
Analysis of cache replacement-algorithms
Analysis of cache replacement-algorithms
Proceedings of the 18th annual international conference on Supercomputing
IATAC: a smart predictor to turn-off L2 cache lines
ACM Transactions on Architecture and Code Optimization (TACO)
The V-Way Cache: Demand Based Associativity via Global Replacement
Proceedings of the 32nd annual international symposium on Computer Architecture
ARC: A Self-Tuning, Low Overhead Replacement Cache
FAST '03 Proceedings of the 2nd USENIX Conference on File and Storage Technologies
Counter-Based Cache Replacement Algorithms
ICCD '05 Proceedings of the 2005 International Conference on Computer Design
A Case for MLP-Aware Cache Replacement
Proceedings of the 33rd annual international symposium on Computer Architecture
MEDEA '06 Proceedings of the 2006 workshop on MEmory performance: DEaling with Applications, systems and architectures
Adaptive Caches: Effective Shaping of Cache Behavior to Workloads
Proceedings of the 39th Annual IEEE/ACM International Symposium on Microarchitecture
Adaptive insertion policies for high performance caching
Proceedings of the 34th annual international symposium on Computer architecture
L1 Cache Filtering Through Random Selection of Memory References
PACT '07 Proceedings of the 16th International Conference on Parallel Architecture and Compilation Techniques
Scavenger: A New Last Level Cache Architecture with Global Block Priority
Proceedings of the 40th Annual IEEE/ACM International Symposium on Microarchitecture
Emulating Optimal Replacement with a Shepherd Cache
Proceedings of the 40th Annual IEEE/ACM International Symposium on Microarchitecture
Cache bursts: A new approach for eliminating dead blocks and increasing cache efficiency
Proceedings of the 41st annual IEEE/ACM International Symposium on Microarchitecture
On the theory and potential of LRU-MRU collaborative cache management
Proceedings of the international symposium on Memory management
A generalized theory of collaborative caching
Proceedings of the 2012 international symposium on Memory Management
CATS: cache aware task-stealing based on online profiling in multi-socket multi-core architectures
Proceedings of the 26th ACM international conference on Supercomputing
Unified memory optimizing architecture: memory subsystem control with a unified predictor
Proceedings of the 26th ACM international conference on Supercomputing
Optimal bypass monitor for high performance last-level caches
Proceedings of the 21st international conference on Parallel architectures and compilation techniques
Pacman: program-assisted cache management
Proceedings of the 2013 international symposium on memory management
Hi-index | 0.00 |
The L2 cache is commonly managed using LRU policy. For workloads that have a working set larger than L2 cache, LRU behaves poorly, resulting in a great number of less reused lines that are never reused or reused for few times. In this case, the cache performance can be improved through retaining a portion of working set in cache for a period long enough. Previous schemes approach this by bypassing never reused lines. Nevertheless, severely constrained by the number of never reused lines, sometimes they deliver no benefit due to the lack of never reused lines. This paper proposes a new filtering mechanism that filters out the less reused lines rather than just never reused lines. The extended scope of bypassing provides more opportunities to fit the working set into cache. This paper also proposes a Less Reused Filter (LRF), a separate structure that precedes L2 cache, to implement the above mechanism. LRF employs a reuse frequency predictor to accurately identify the less reused lines from incoming lines. Meanwhile, based on our observation that most less reused lines have a short life span, LRF places the filtered lines into a small filter buffer to fully utilize them, avoiding extra misses. Our evaluation, for 24 SPEC 2000 benchmarks, shows that augmenting a 512KB LRU-managed L2 cache with a LRF having 32KB filter buffer reduces the average MPKI by 27.5%, narrowing the gap between LRU and OPT by 74.4%.