By examining the data reuse patterns of four array-intensive embedded applications, we found that these codes exhibit a significant amount of inter-nest reuse (i.e., data reuse that occurs between different loop nests). While traditional compiler techniques that target array-intensive applications can exploit intra-nest data reuse, there has been little success so far in exploiting inter-nest data reuse. In this paper, we present a compiler strategy that optimizes inter-nest reuse using loop (iteration space) transformations. Our approach captures the impact that executing a nest has on cache contents using an abstraction called a footprint vector. It then transforms a given nest so that the new (transformed) access pattern reuses the data left in the cache by the previous nest in the code. In optimizing inter-nest locality, our approach also tries to achieve good intra-nest locality. Our simulation results indicate large performance improvements. In particular, inter-nest loop optimization produces results that are competitive with those of intra-nest loop and data optimizations.
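
To make the notion of inter-nest reuse concrete, the following minimal sketch counts cache misses for two consecutive loop nests that traverse the same array. It is an illustration only and does not implement the footprint vector abstraction or the transformation algorithm of the paper. The cache model (fully associative, LRU replacement, one array element per line), the sizes `N` and `CAPACITY`, and the helper `count_misses` are all assumptions chosen for the example. Reversing the second nest stands in for a loop transformation that lets it start with the data the first nest left in the cache.

```python
from collections import OrderedDict


def count_misses(trace, capacity):
    """Count misses of a hypothetical fully associative LRU cache.

    Each entry of `trace` is an array index; the cache holds `capacity`
    elements (one element per line, an assumption made for simplicity).
    """
    cache = OrderedDict()
    misses = 0
    for addr in trace:
        if addr in cache:
            cache.move_to_end(addr)        # hit: mark as most recently used
        else:
            misses += 1
            cache[addr] = True
            if len(cache) > capacity:
                cache.popitem(last=False)  # evict the least recently used entry
    return misses


N = 64         # array length (assumed)
CAPACITY = 16  # cache capacity in elements (assumed)

# Nest 1: for i in 0..N-1, read A[i].
nest1 = list(range(N))

# Nest 2, original order: for j in 0..N-1, read A[j].
nest2_forward = list(range(N))

# Nest 2 after loop reversal: for j in N-1..0, read A[j].
nest2_reversed = list(range(N - 1, -1, -1))

misses_forward = count_misses(nest1 + nest2_forward, CAPACITY)
misses_reversed = count_misses(nest1 + nest2_reversed, CAPACITY)

print("misses, original order:", misses_forward)
print("misses, second nest reversed:", misses_reversed)
```

Under these assumptions, the original order gets no benefit from the data the first nest leaves behind, because the second nest starts at the elements that were evicted first. The reversed nest begins with the last `CAPACITY` elements touched by the first nest and hits on all of them.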