Path-Based Reuse Distance Analysis

Authors:
Changpeng Fang; Steve Carr; Soner Önder; Zhenlin Wang
Affiliations:
PathScale, Inc., Mountain View, CA; Department of Computer Science, Michigan Technological University, Houghton, MI; Department of Computer Science, Michigan Technological University, Houghton, MI; Department of Computer Science, Michigan Technological University, Houghton, MI
Venue:
CC'06 Proceedings of the 15th international conference on Compiler Construction
Year:
2006

Abstract

Profiling can effectively analyze program behavior and provide critical information for feedback-directed or dynamic optimizations. Based on memory profiling, reuse distance analysis has shown much promise in predicting data locality for a program using inputs other than the profiled ones. Both whole-program and instruction-based locality can be accurately predicted by reuse distance analysis. Reuse distance analysis abstracts a cluster of memory references that are issued by a particular instruction and have similar reuse distance values into a locality pattern. Prior work has shown that a significant number of memory instructions have multiple locality patterns, an undesirable property for many instruction-based memory optimizations. This paper investigates the relationship between locality patterns and execution paths by analyzing the reuse distance distribution along each dynamic path to an instruction. Here a path is defined as the program execution trace from the previous access of a memory location to the current access. By differentiating locality patterns with the context of execution paths, the proposed analysis can expose optimization opportunities tailored only to a specific subset of paths leading to an instruction. In this paper, we present an effective method for path-based reuse distance profiling and analysis. We have observed that a significant percentage of the multiple locality patterns for an instruction can be uniquely related to a particular execution path in the program. We have also investigated the influence of inputs on the reuse distance distribution for each path/instruction pair. The experimental results show that the path-based reuse distance is highly predictable, as a function of the data size, for a set of SPEC CPU2000 programs.
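
As a rough illustration of the idea in the abstract, the following Python sketch groups reuse distances by path/instruction pair. It is not the authors' profiler. It assumes a hypothetical trace format of `(instr, block, addr)` tuples, takes reuse distance to be the number of distinct addresses touched between two accesses to the same address, and represents a path as the sequence of basic blocks from the previous access to the current one, with consecutive repeats of a block merged. The function name `path_reuse_distances` and the sample trace are invented for illustration, and the quadratic scan is chosen for clarity rather than efficiency.

```python
from collections import defaultdict


def path_reuse_distances(trace):
    """Group reuse distances by (instruction, path).

    trace: iterable of (instr, block, addr) tuples in execution order
           (a hypothetical format assumed for this sketch).
    Returns a dict mapping (instr, path) to a list of reuse distances,
    where path is a tuple of basic-block names.
    """
    events = list(trace)
    last_index = {}              # addr -> index of its most recent access
    result = defaultdict(list)

    for i, (instr, _block, addr) in enumerate(events):
        if addr in last_index:
            j = last_index[addr]
            # Reuse distance: distinct addresses accessed strictly between
            # the previous access (index j) and the current one (index i).
            distance = len({a for _, _, a in events[j + 1:i]})
            # Path: blocks from the previous access through the current one,
            # merging consecutive repeats (a simplifying assumption).
            path = []
            for _, b, _ in events[j:i + 1]:
                if not path or path[-1] != b:
                    path.append(b)
            result[(instr, tuple(path))].append(distance)
        last_index[addr] = i

    return dict(result)


# Hypothetical trace: instruction "ld3" reuses data along two different paths.
trace = [
    ("ld1", "B1", 0x10),
    ("ld2", "B2", 0x20),
    ("ld3", "B4", 0x10),   # reuse of 0x10 via B1 -> B2 -> B4
    ("ld1", "B1", 0x30),
    ("ld2", "B3", 0x40),
    ("ld2", "B3", 0x50),
    ("ld3", "B4", 0x30),   # reuse of 0x30 via B1 -> B3 -> B4
]
result = path_reuse_distances(trace)
for (instr, path), distances in sorted(result.items()):
    print(instr, "->".join(path), distances)
```

In this toy trace, `ld3` shows two different reuse distances when viewed per instruction, but each distance maps to exactly one path, which is the kind of separation the abstract describes.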