Data cache management using frequency-based replacement
SIGMETRICS '90 Proceedings of the 1990 ACM SIGMETRICS conference on Measurement and modeling of computer systems
Cache replacement with dynamic exclusion
ISCA '92 Proceedings of the 19th annual international symposium on Computer architecture
A modified approach to data cache management
Proceedings of the 28th annual international symposium on Microarchitecture
Utilizing reuse information in data cache management
ICS '98 Proceedings of the 12th international conference on Supercomputing
IEEE Transactions on Computers
Using the Compiler to Improve Cache Replacement Decisions
Proceedings of the 2002 International Conference on Parallel Architectures and Compilation Techniques
Design and performance evaluation of a cache assist to implement selective caching
ICCD '97 Proceedings of the 1997 International Conference on Computer Design (ICCD '97)
Self-correcting LRU replacement policies
Proceedings of the 1st conference on Computing frontiers
Counter-Based Cache Replacement Algorithms
ICCD '05 Proceedings of the 2005 International Conference on Computer Design
Less reused filter: improving l2 cache performance via filtering less reused lines
Proceedings of the 23rd international conference on Supercomputing
DIEF: an accurate interference feedback mechanism for chip multiprocessor memory systems
HiPEAC'10 Proceedings of the 5th international conference on High Performance Embedded Architectures and Compilers
Hi-index | 0.00 |
This paper proposes a new replacement algorithm to protect cache lines with potential future reuse from being evicted. In contrast to the recency based approaches used in the past (LRU for example), our algorithm also uses the notion of frequency of access. Instead of evicting the least recently used block, our algorithm identifies among a set of LRU blocks the one that is also least-frequently-used (according to a heuristic) and chooses that as a victim. We have implemented this replacement algorithm in a detailed simulation model of a chip multiprocessor system driven by SPEC2000 benchmarks. We have found that the new scheme improves performance for memory intensive applications. Moreover, as compared to other attempts, our replacement algorithm provides robust improvements across all benchmarks. We have also extended an earlier scheme proposed by Wong and Baer so it is switched off when performance is not improved. Our results show that this makes the scheme much more suitable for CMP configurations.