Analyzing data reuse for cache reconfiguration

Authors:
J. Hu; M. Kandemir; N. Vijaykrishnan; M. J. Irwin
Affiliations:
New Jersey Institute of Technology, Newark, NJ; The Pennsylvania State University, University Park, PA; The Pennsylvania State University, University Park, PA; The Pennsylvania State University, University Park, PA
Venue:
ACM Transactions on Embedded Computing Systems (TECS)
Year:
2005

Abstract

Classical compiler optimizations assume a fixed cache architecture and modify the program to take best advantage of it. This is not always the best strategy: each loop nest might work best with a different cache configuration, and transforming a nest to suit a given fixed configuration may not be possible because of data and control dependences. A fixed cache configuration can also increase energy consumption in loops whose best configuration is smaller than the default (fixed) one. In this paper, we take an alternative approach and modify the cache configuration for each nest according to the access pattern the nest exhibits. We call this technique compiler-directed cache polymorphism (CDCP). Specifically, we make the following contributions. First, we present an approach for analyzing the data reuse properties of loop nests. Second, we give algorithms to simulate the footprints of array references in their reuse space. Third, based on our reuse analysis, we present an optimization algorithm to compute the cache configuration for each loop nest. Our experimental results show that CDCP is very effective in finding near-optimal data cache configurations for different nests in array-intensive applications.
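
To make the idea of choosing a cache configuration per loop nest concrete, here is a minimal Python sketch. It is not the paper's reuse analysis or optimization algorithm: it swaps in a brute-force, trace-driven LRU cache simulation, and every name in it (`simulate_misses`, `nest_trace`, `pick_config`, the `slack` parameter, the example matrix-vector nest, and the candidate configurations) is a hypothetical chosen for illustration. It only shows the selection step: among candidate configurations, keep the smallest one whose miss count stays close to the best observed.

```python
from collections import OrderedDict


def simulate_misses(trace, size, assoc, line):
    """Count misses of a byte-address trace in an LRU set-associative cache.

    size, assoc, and line are the cache capacity in bytes, the number of
    ways, and the line size in bytes (all hypothetical parameters).
    """
    num_sets = size // (assoc * line)
    sets = [OrderedDict() for _ in range(num_sets)]
    misses = 0
    for addr in trace:
        block = addr // line
        ways = sets[block % num_sets]
        if block in ways:
            ways.move_to_end(block)       # hit: mark as most recently used
        else:
            misses += 1
            if len(ways) >= assoc:
                ways.popitem(last=False)  # evict the least recently used line
            ways[block] = True
    return misses


def nest_trace(n, elem=8):
    """Addresses touched by the nest: for i: for j: y[i] += a[i][j] * x[j].

    Assumes a, x, and y are laid out contiguously starting at address 0.
    """
    a_base = 0
    x_base = n * n * elem
    y_base = x_base + n * elem
    for i in range(n):
        for j in range(n):
            yield a_base + (i * n + j) * elem
            yield x_base + j * elem
            yield y_base + i * elem


def pick_config(trace, configs, slack=0.05):
    """Return the smallest (size, assoc, line) within `slack` of the fewest misses."""
    trace = list(trace)
    results = {cfg: simulate_misses(trace, *cfg) for cfg in configs}
    best = min(results.values())
    good = [cfg for cfg, m in results.items() if m <= best * (1 + slack)]
    return min(good, key=lambda cfg: (cfg[0], cfg[1])), results


if __name__ == "__main__":
    candidates = [(256, 1, 32), (512, 2, 32), (1024, 2, 32), (4096, 1, 32)]
    chosen, results = pick_config(nest_trace(16), candidates)
    for cfg, misses in results.items():
        print(cfg, misses)
    print("chosen:", chosen)
```

Simulating every candidate is far more expensive than a compile-time analysis; the sketch is meant only to illustrate why a nest with a small working set can run on a smaller configuration without incurring extra misses.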