Static reuse distances for locality-based optimizations in MATLAB

Authors:
Arun Chauhan;Chun-Yu Shei
Affiliations:
School of Informatics and Computing, Bloomington, IN;School of Informatics and Computing, Bloomington, IN
Venue:
Proceedings of the 24th ACM International Conference on Supercomputing
Year:
2010



Abstract

Modeling the memory locality of applications systematically enough to guide compiler optimizations remains an important unsolved problem, made even more significant by the advent of multi-core and many-core architectures. We describe an approach based on a novel source-level metric, called static reuse distance, to model the memory behavior of applications written in MATLAB. We use MATLAB as a representative language that lets end users express their algorithms precisely, but at a relatively high level. MATLAB's high-level characteristics allow the static analysis to focus on large objects, such as arrays, without losing accuracy due to the processor-specific layout of scalar values in memory. We present an efficient algorithm to compute static reuse distances using an extended version of dependence graphs. Our approach differs from earlier, similar attempts in three important respects: it targets high-level programming systems characterized by heavy use of libraries; it works on full programs instead of being confined to loops; and it integrates practical mechanisms to handle separately compiled procedures as well as precompiled library procedures that are available only in binary form. We study MATLAB code taken from real programs to demonstrate the effectiveness of our model. Finally, we present some applications of our approach to program transformations that are known to be important in MATLAB and are expected to be relevant to other, similar high-level languages as well.
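
The abstract does not define static reuse distance, so the following is only a minimal sketch of the conventional, trace-based notion of reuse distance that the name builds on: the number of distinct other objects touched between two consecutive accesses to the same object. It is counted here at the granularity of whole arrays, in keeping with the abstract's focus on large objects. The function name `reuse_distances` and the example trace are hypothetical, and the code is not the paper's algorithm, which works statically on an extended dependence graph rather than on an execution trace.

```python
def reuse_distances(trace):
    """Classic trace-based reuse distance (illustrative only, not the paper's static metric).

    For each access in `trace`, return the number of distinct other names
    accessed since the previous access to the same name, or None if this
    is the first access to that name.
    """
    last_seen = {}   # name -> index of its most recent access
    distances = []
    for i, name in enumerate(trace):
        if name in last_seen:
            # Distinct names touched strictly between the two accesses.
            between = set(trace[last_seen[name] + 1 : i])
            distances.append(len(between))
        else:
            distances.append(None)
        last_seen[name] = i
    return distances


# Hypothetical sequence of array accesses, one entry per statement operand.
example_trace = ["A", "B", "C", "A", "B", "B"]
print(reuse_distances(example_trace))  # [None, None, None, 2, 2, 0]
```

A small distance suggests the array may still be in cache when it is reused; a large one suggests it may have been evicted. This quadratic version favors readability over speed.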