Analytical cache models with applications to cache partitioning
ICS '01 Proceedings of the 15th international conference on Supercomputing
A framework for reducing the cost of instrumented code
Proceedings of the ACM SIGPLAN 2001 conference on Programming language design and implementation
Optimizing compilers for modern architectures: a dependence-based approach
Optimizing compilers for modern architectures: a dependence-based approach
Dynamic hot data stream prefetching for general-purpose programs
PLDI '02 Proceedings of the ACM SIGPLAN 2002 Conference on Programming language design and implementation
Flexible reference trace reduction for VM simulations
ACM Transactions on Modeling and Computer Simulation (TOMACS)
Estimating cache misses and locality using stack distances
ICS '03 Proceedings of the 17th annual international conference on Supercomputing
Cross-architecture performance predictions for scientific applications using parameterized models
Proceedings of the joint international conference on Measurement and modeling of computer systems
Predicting Inter-Thread Cache Contention on a Chip Multi-Processor Architecture
HPCA '05 Proceedings of the 11th International Symposium on High-Performance Computer Architecture
Fast data-locality profiling of native execution
SIGMETRICS '05 Proceedings of the 2005 ACM SIGMETRICS international conference on Measurement and modeling of computer systems
Pin: building customized program analysis tools with dynamic instrumentation
Proceedings of the 2005 ACM SIGPLAN conference on Programming language design and implementation
Generating cache hints for improved program efficiency
Journal of Systems Architecture: the EUROMICRO Journal
Multiple Page Size Modeling and Optimization
Proceedings of the 14th International Conference on Parallel Architectures and Compilation Techniques
Accurate, efficient, and adaptive calling context profiling
Proceedings of the 2006 ACM SIGPLAN conference on Programming language design and implementation
SPEC CPU2006 benchmark descriptions
ACM SIGARCH Computer Architecture News
Locality approximation using time
Proceedings of the 34th annual ACM SIGPLAN-SIGACT symposium on Principles of programming languages
Shadow Profiling: Hiding Instrumentation Costs with Parallelism
Proceedings of the International Symposium on Code Generation and Optimization
SuperPin: Parallelizing Dynamic Instrumentation for Real-Time Performance
Proceedings of the International Symposium on Code Generation and Optimization
All-window profiling of concurrent executions
Proceedings of the 13th ACM SIGPLAN Symposium on Principles and practice of parallel programming
Sampling-based program locality approximation
Proceedings of the 7th international symposium on Memory management
The PARSEC benchmark suite: characterization and architectural implications
Proceedings of the 17th international conference on Parallel architectures and compilation techniques
RapidMRC: approximating L2 miss rate curves on commodity systems for online optimizations
Proceedings of the 14th international conference on Architectural support for programming languages and operating systems
Program locality analysis using reuse distance
ACM Transactions on Programming Languages and Systems (TOPLAS)
Addressing shared resource contention in multicore processors via scheduling
Proceedings of the fifteenth edition of ASPLOS on Architectural support for programming languages and operating systems
Static reuse distances for locality-based optimizations in MATLAB
Proceedings of the 24th ACM International Conference on Supercomputing
Accelerating multicore reuse distance analysis with sampling and parallelization
Proceedings of the 19th international conference on Parallel architectures and compilation techniques
All-window profiling and composable models of cache sharing
Proceedings of the 16th ACM symposium on Principles and practice of parallel programming
Fast modeling of shared caches in multicore systems
Proceedings of the 6th International Conference on High Performance and Embedded Architectures and Compilers
Discovery of locality-improving refactorings by reuse path analysis
HPCC'06 Proceedings of the Second international conference on High Performance Computing and Communications
PACT '11 Proceedings of the 2011 International Conference on Parallel Architectures and Compilation Techniques
Linear-time Modeling of Program Working Set in Shared Cache
PACT '11 Proceedings of the 2011 International Conference on Parallel Architectures and Compilation Techniques
Combining locality analysis with online proactive job co-scheduling in chip multiprocessors
HiPEAC'10 Proceedings of the 5th international conference on High Performance Embedded Architectures and Compilers
Is reuse distance applicable to data locality analysis on chip multiprocessors?
CC'10/ETAPS'10 Proceedings of the 19th joint European conference on Theory and Practice of Software, international conference on Compiler Construction
Path-Based reuse distance analysis
CC'06 Proceedings of the 15th international conference on Compiler Construction
Proceedings of the 2012 ACM SIGPLAN Workshop on Memory Systems Performance and Correctness
Cache Conscious Task Regrouping on Multicore Processors
CCGRID '12 Proceedings of the 2012 12th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing (ccgrid 2012)
Locality Principle Revisited: A Probability-Based Quantitative Approach
IPDPS '12 Proceedings of the 2012 IEEE 26th International Parallel and Distributed Processing Symposium
A Highly Parallel Reuse Distance Analysis Algorithm on GPUs
IPDPS '12 Proceedings of the 2012 IEEE 26th International Parallel and Distributed Processing Symposium
PARDA: A Fast Parallel Reuse Distance Analysis Algorithm
IPDPS '12 Proceedings of the 2012 IEEE 26th International Parallel and Distributed Processing Symposium
Hi-index | 0.00 |
The locality metrics are many, for example, miss ratio to test performance, data footprint to manage cache sharing, and reuse distance to analyze and optimize a program. It is unclear how different metrics are related, whether one subsumes another, and what combination may represent locality completely. This paper first derives a set of formulas to convert between five locality metrics and gives the condition for correctness. The transformation is analogous to differentiation and integration used to convert between higher order polynomials. As a result, these metrics can be assigned an order and organized into a hierarchy. Using the new theory, the paper then develops two techniques: one measures the locality in real time without special hardware support, and the other predicts multicore cache interference without parallel testing. The paper evaluates them using sequential and parallel programs as well as for a parallel mix of sequential programs.