We consider the problem of on-chip L2 cache management and replacement policies. We propose a new adaptive cache replacement policy, called Dueling CLOCK (DC), that has several advantages over the Least Recently Used (LRU) cache replacement policy. LRU's strength is that it keeps track of the 'recency' information of memory accesses. However, a) LRU incurs a high overhead because it moves a cache block into the most recently used position each time the block is accessed; b) LRU does not exploit the 'frequency' information of memory accesses; and c) LRU is prone to cache pollution when a sequence of single-use memory accesses larger than the cache size is fetched from memory (i.e., it is not scan resistant). The DC policy was developed to have low overhead, to capture 'recency' information in memory accesses, to exploit the 'frequency' pattern of memory accesses, and to be scan resistant. In this paper, we propose a hardware implementation of the CLOCK algorithm for use within an on-chip cache controller to ensure low overhead. We then present the DC policy, an adaptive replacement policy that alternates between the CLOCK algorithm and a scan-resistant version of the CLOCK algorithm. We present experimental results comparing the MPKI (misses per thousand instructions) of DC against existing replacement policies, such as LRU. The results for an 8-way 1MB L2 cache show that DC lowers the MPKI of the SPEC CPU2000 benchmarks by an average of 10.6% when compared to the tree-based Pseudo-LRU cache replacement policy.
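
To make the mechanism concrete, the following is a minimal software sketch of the textbook CLOCK algorithm applied to a single cache set, not the hardware design proposed in the paper. The names `ClockSet`, `access`, and `mpki` are hypothetical. The abstract does not say how the scan-resistant variant works, so the sketch assumes one common approach: a newly filled block starts with its reference bit cleared, and only a later hit sets it. The adaptive selection between the two variants is omitted.

```python
class ClockSet:
    """One cache set managed by CLOCK (illustrative sketch, hypothetical names)."""

    def __init__(self, ways, scan_resistant=False):
        self.tags = [None] * ways   # block tag held in each way (None = empty)
        self.ref = [0] * ways       # one reference bit per way
        self.hand = 0               # clock hand: next eviction candidate
        self.scan_resistant = scan_resistant

    def access(self, tag):
        """Return True on a hit, False on a miss (the block is then filled)."""
        if tag in self.tags:
            # Hit: only set the reference bit; nothing is reordered as in LRU.
            self.ref[self.tags.index(tag)] = 1
            return True
        # Miss: sweep the hand, clearing set bits, until a way with bit 0 is found.
        while self.ref[self.hand] == 1:
            self.ref[self.hand] = 0
            self.hand = (self.hand + 1) % len(self.tags)
        self.tags[self.hand] = tag
        # Assumption (not from the paper): the scan-resistant variant inserts
        # with the bit cleared, so single-use blocks are evicted first.
        self.ref[self.hand] = 0 if self.scan_resistant else 1
        self.hand = (self.hand + 1) % len(self.tags)
        return False


def mpki(misses, instructions):
    """Misses per thousand instructions."""
    return misses * 1000.0 / instructions
```

Under this assumption, a block that has been hit at least once survives a short run of single-use accesses in the scan-resistant variant, while plain CLOCK treats every newly inserted block as recently used.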