Efficient simulation of caches under optimal replacement with applications to miss characterization
SIGMETRICS '93 Proceedings of the 1993 ACM SIGMETRICS conference on Measurement and modeling of computer systems
Optimizing compilers for modern architectures: a dependence-based approach
Optimizing compilers for modern architectures: a dependence-based approach
Using the Compiler to Improve Cache Replacement Decisions
Proceedings of the 2002 International Conference on Parallel Architectures and Compilation Techniques
Reuse Distance-Based Cache Hint Selection
Euro-Par '02 Proceedings of the 8th International Euro-Par Conference on Parallel Processing
The EELRU adaptive replacement algorithm
Performance Evaluation
Cross-architecture performance predictions for scientific applications using parameterized models
Proceedings of the joint international conference on Measurement and modeling of computer systems
Dynamic tracking of page miss ratio curve for memory management
ASPLOS XI Proceedings of the 11th international conference on Architectural support for programming languages and operating systems
Automatic pool allocation: improving performance by controlling data structure layout in the heap
Proceedings of the 2005 ACM SIGPLAN conference on Programming language design and implementation
IEEE Transactions on Computers
Generating cache hints for improved program efficiency
Journal of Systems Architecture: the EUROMICRO Journal
Instruction Based Memory Distance Analysis and its Application
Proceedings of the 14th International Conference on Parallel Architectures and Compilation Techniques
POWER5 System microarchitecture
IBM Journal of Research and Development - POWER5 and packaging
Adaptive insertion policies for high performance caching
Proceedings of the 34th annual international symposium on Computer architecture
CRAMM: virtual memory support for garbage-collected applications
OSDI '06 Proceedings of the 7th symposium on Operating systems design and implementation
P-OPT: Program-Directed Optimal Cache Management
Languages and Compilers for Parallel Computing
Less reused filter: improving l2 cache performance via filtering less reused lines
Proceedings of the 23rd international conference on Supercomputing
Cross-Input Learning and Discriminative Prediction in Evolvable Virtual Machines
Proceedings of the 7th annual IEEE/ACM International Symposium on Code Generation and Optimization
Soft-OLP: Improving Hardware Cache Performance through Software-Controlled Object-Level Partitioning
PACT '09 Proceedings of the 2009 18th International Conference on Parallel Architectures and Compilation Techniques
A study of replacement algorithms for a virtual-storage computer
IBM Systems Journal
Evaluation techniques for storage hierarchies
IBM Systems Journal
Z-rays: divide arrays and conquer speed and flexibility
PLDI '10 Proceedings of the 2010 ACM SIGPLAN conference on Programming language design and implementation
On the theory and potential of LRU-MRU collaborative cache management
Proceedings of the international symposium on Memory management
Characterization and dynamic mitigation of intra-application cache interference
ISPASS '11 Proceedings of the IEEE International Symposium on Performance Analysis of Systems and Software
Dynamic access distance driven cache replacement
ACM Transactions on Architecture and Code Optimization (TACO)
Why nothing matters: the impact of zeroing
Proceedings of the 2011 ACM international conference on Object oriented programming systems languages and applications
Automated locality optimization based on the reuse distance of string operations
CGO '11 Proceedings of the 9th Annual IEEE/ACM International Symposium on Code Generation and Optimization
A generalized theory of collaborative caching
Proceedings of the 2012 international symposium on Memory Management
Proceedings of the ACM SIGPLAN Workshop on Memory Systems Performance and Correctness
Hi-index | 0.00 |
As caches become larger and shared by an increasing number of cores, cache management is becoming more important. This paper explores collaborative caching, which uses software hints to influence hardware caching. Recent studies have shown that such collaboration between software and hardware can theoretically achieve optimal cache replacement on LRU-like cache. This paper presents Pacman, a practical solution for collaborative caching in loop-based code. Pacman uses profiling to analyze patterns in an optimal caching policy in order to determine which data to cache and at what time. It then splits each loop into different parts at compile time. At run time, the loop boundary is adjusted to selectively store data that would be stored in an optimal policy. In this way, Pacman emulates the optimal policy wherever it can. Pacman requires a single bit at the load and store instructions. Some of the current hardware has partial support. This paper presents results using both simulated and real systems, and compares simulated results to related caching policies.