Inexpensive implementations of set-associativity
ISCA '89 Proceedings of the 16th annual international symposium on Computer architecture
A case for two-way skewed-associative caches
ISCA '93 Proceedings of the 20th annual international symposium on computer architecture
Efficient detection of all pointer and array access errors
PLDI '94 Proceedings of the ACM SIGPLAN 1994 conference on Programming language design and implementation
Skewed associativity enhances performance predictability
ISCA '95 Proceedings of the 22nd annual international symposium on Computer architecture
ISCA '96 Proceedings of the 23rd annual international symposium on Computer architecture
Highly accurate data value prediction using hybrid predictors
MICRO 30 Proceedings of the 30th annual ACM/IEEE international symposium on Microarchitecture
MediaBench: a tool for evaluating and synthesizing multimedia and communicatons systems
MICRO 30 Proceedings of the 30th annual ACM/IEEE international symposium on Microarchitecture
The YAGS branch prediction scheme
MICRO 31 Proceedings of the 31st annual ACM/IEEE international symposium on Microarchitecture
Capturing dynamic memory reference behavior with adaptive cache topology
Proceedings of the eighth international conference on Architectural support for programming languages and operating systems
Speculation techniques for improving load related instruction scheduling
ISCA '99 Proceedings of the 26th annual international symposium on Computer architecture
A fully associative software-managed cache design
Proceedings of the 27th annual international symposium on Computer architecture
Focusing processor policies via critical-path prediction
ISCA '01 Proceedings of the 28th annual international symposium on Computer architecture
HPCA '02 Proceedings of the 8th International Symposium on High-Performance Computer Architecture
Picking Statistically Valid and Early Simulation Points
Proceedings of the 12th International Conference on Parallel Architectures and Compilation Techniques
General adaptive replacement policies
Proceedings of the 4th international symposium on Memory management
Adaptive Mechanisms and Policies for Managing Cache Hierarchies in Chip Multiprocessors
Proceedings of the 32nd annual international symposium on Computer Architecture
The V-Way Cache: Demand Based Associativity via Global Replacement
Proceedings of the 32nd annual international symposium on Computer Architecture
MiBench: A free, commercially representative embedded benchmark suite
WWC '01 Proceedings of the Workload Characterization, 2001. WWC-4. 2001 IEEE International Workshop
A Case for MLP-Aware Cache Replacement
Proceedings of the 33rd annual international symposium on Computer Architecture
BioBench: A Benchmark Suite of Bioinformatics Applications
ISPASS '05 Proceedings of the IEEE International Symposium on Performance Analysis of Systems and Software, 2005
Adaptive insertion policies for high performance caching
Proceedings of the 34th annual international symposium on Computer architecture
Less reused filter: improving l2 cache performance via filtering less reused lines
Proceedings of the 23rd international conference on Supercomputing
Divide-and-conquer: a bubble replacement for low level caches
Proceedings of the 23rd international conference on Supercomputing
Exploiting memory soft redundancy for joint improvement of error tolerance and access efficiency
IEEE Transactions on Very Large Scale Integration (VLSI) Systems
Cache line reservation: exploring a scheme for cache-friendly object allocation
CASCON '09 Proceedings of the 2009 Conference of the Center for Advanced Studies on Collaborative Research
Global management of cache hierarchies
Proceedings of the 7th ACM international conference on Computing frontiers
Instruction-based reuse-distance prediction for effective cache management
SAMOS'09 Proceedings of the 9th international conference on Systems, architectures, modeling and simulation
High performance cache replacement using re-reference interval prediction (RRIP)
Proceedings of the 37th annual international symposium on Computer architecture
Dueling CLOCK: adaptive cache replacement policy based on the CLOCK algorithm
Proceedings of the Conference on Design, Automation and Test in Europe
Quality of service shared cache management in chip multiprocessor architecture
ACM Transactions on Architecture and Code Optimization (TACO)
A phase adaptive cache hierarchy for SMT processors
Microprocessors & Microsystems
SHiP: signature-based hit predictor for high performance caching
Proceedings of the 44th Annual IEEE/ACM International Symposium on Microarchitecture
Code-based cache partitioning for improving hardware cache performance
Proceedings of the 6th International Conference on Ubiquitous Information Management and Communication
Optimal bypass monitor for high performance last-level caches
Proceedings of the 21st international conference on Parallel architectures and compilation techniques
Exploiting reuse locality on inclusive shared last-level caches
ACM Transactions on Architecture and Code Optimization (TACO) - Special Issue on High-Performance Embedded Architectures and Compilers
ARI: Adaptive LLC-memory traffic management
ACM Transactions on Architecture and Code Optimization (TACO)
An effectiveness-based adaptive cache replacement policy
Microprocessors & Microsystems
Hi-index | 0.02 |
We present and evaluate the idea of adaptive processor cache management. Specifically, we describe a novel and general scheme by which we can combine any two cache management algorithms (e.g., LRU, LFU, FIFO, Random) and adaptively switch between them, closely tracking the locality characteristics of a given program. The scheme is inspired by recent work in virtual memory management at the operating system level, which has shown that it is possible to adapt over two replacement policies to provide an aggregate policy that always performs within a constant factor of the better component policy. A hardware implementation of adaptivity requires very simple logic but duplicate tag structures. To reduce the overhead, we use partial tags, which achieve good performance with a small hardware cost. In particular, adapting between LRU and LFU replacement policies on an 8-way 512KB L2 cache yields a 12.7% improvement in average CPI on applications that exhibit a non-negligible L2 miss ratio. Our approach increases total cache storage by 4.0%, but it still provides slightly better performance than a conventional 10-way setassociative 640KB cache which requires 25% more storage.