Modern instruction sets extend their load/store instructions with cache hints as an additional means of bridging the processor-memory speed gap. A cache hint specifies the cache level at which the data is likely to be found, as well as the cache level at which the data should be stored after it is accessed. To improve a program's cache behavior, the cache hint is selected based on the data locality of the instruction. We represent the data locality of an instruction by its reuse distance distribution. The reuse distance is the amount of data addressed between two accesses to the same memory location. The distribution makes it possible to estimate efficiently the cache level where the data will be found, and to determine the level where the data should be stored to improve the hit rate. The Open64 EPIC compiler was extended with cache hint selection, which resulted in speedups of up to 36% in numerical programs and up to 23% in nonnumerical programs on an Itanium multiprocessor.
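To make the notion of reuse distance concrete, the following minimal Python sketch computes reuse distances for a short address trace and picks a cache level from the resulting distribution. It is an illustration under stated assumptions, not the selection procedure implemented in the compiler: the function names `reuse_distances` and `select_hint_level`, the `threshold` parameter, the fully associative LRU cache model, and the measurement of capacity in distinct addresses are all hypothetical choices made for this example.

```python
def reuse_distances(trace):
    """Return one reuse distance per access (None for a first access).

    The reuse distance is counted here as the number of distinct
    addresses touched between two accesses to the same address.
    """
    stack = []  # LRU stack: most recently used address is last
    distances = []
    for addr in trace:
        if addr in stack:
            pos = stack.index(addr)
            # Everything above addr in the stack was touched since its last use.
            distances.append(len(stack) - 1 - pos)
            stack.pop(pos)
        else:
            distances.append(None)
        stack.append(addr)
    return distances


def select_hint_level(distances, capacities, threshold=0.9):
    """Pick the smallest cache level expected to hold most reuses.

    Assumption: a fully associative LRU cache with room for `capacity`
    distinct addresses hits exactly when the reuse distance is smaller
    than `capacity`. `threshold` is a hypothetical tuning knob.
    Returns a 1-based level, or None if no level qualifies.
    """
    reuses = [d for d in distances if d is not None]
    if not reuses:
        return None
    for level, capacity in enumerate(capacities, start=1):
        covered = sum(1 for d in reuses if d < capacity)
        if covered / len(reuses) >= threshold:
            return level
    return None


trace = ["a", "b", "c", "a", "b", "b"]
dists = reuse_distances(trace)
print(dists)                             # [None, None, None, 2, 2, 0]
print(select_hint_level(dists, [1, 4]))  # 2
```

In this toy trace, two of the three reuses have distance 2, so a one-entry first level would miss them, while a four-entry second level captures every reuse. A hint targeting the second level would therefore be the natural choice under the assumptions above.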