Cache miss equations: a compiler framework for analyzing and tuning memory behavior

  • Authors:
  • Somnath Ghosh;Margaret Martonosi;Sharad Malik

  • Affiliations:
  • Princeton Univ., Princeton, NJ;Princeton Univ., Princeton, NJ;Princeton Univ., Princeton, NJ

  • Venue:
  • ACM Transactions on Programming Languages and Systems (TOPLAS)
  • Year:
  • 1999

Quantified Score

Hi-index 0.02

Visualization

Abstract

With the ever-widening performance gap between processors and main memory, cache memory, which is used to bridge this gap, is becoming more and more significant. Caches work well for programs that exhibit sufficient locality. Other programs, however, have reference patterns that fail to exploit the cache, thereby suffering heavily from high memory latency. In order to get high cache efficiency and achieve good program performance, efficient memory accessing behavior is necessary. In fact, for many programs, program transformations or source-code changes can radically alter memory access patterns, significantly improving cache performance. Both hand-tuning and compiler optimization techniques are often used to transform codes to improve cache utilization. Unfortunately, cache conflicts are difficult to predict and estimate, precluding effective transformations. Hence, effective transformations require detailed knowledge about the frequency and causes of cache misses in the code. This article describes methods for generating and solving Cache Miss Equations (CMEs) that give a detailed representation of cache behavior, including conflict misses, in loop-oriented scientific code. Implemented within the SUIF compiler framework, our approach extends traditional compiler reuse analysis to generate linear Diophantine equations that summarize each loop's memory behavior. While solving these equations is in general difficult, we show that is also unnecessary, as mathematical techniques for manipulating Diophantine equations allow us to relatively easily compute and/or reduce the number of possible solutions, where each solution corresponds to a potential cache miss. The mathematical precision of CMEs allows us to find true optimal solutions for transformations such as blocking or padding. The generality of CMEs also allows us to reason about interactions between transformations applied in concert. The article also gives examples of their use to determine array padding and offset amounts that minimize cache misses, and to determine optimal blocking factors for tiled code. Overall, these equations represent an analysis framework that offers the generality and precision needed for detailed compiler optimizations.