Precise miss analysis for program transformations with caches of arbitrary associativity

  • Authors:
  • Somnath Ghosh;Margaret Martonosi;Sharad Malik

  • Affiliations:
  • Department of Electrical Engineering, Princeton University, Princeton, NJ;Department of Electrical Engineering, Princeton University, Princeton, NJ;Department of Electrical Engineering, Princeton University, Princeton, NJ

  • Venue:
  • Proceedings of the eighth international conference on Architectural support for programming languages and operating systems
  • Year:
  • 1998

Quantified Score

Hi-index 0.00

Visualization

Abstract

Analyzing and optimizing program memory performance is a pressing problem in high-performance computer architectures. Currently, software solutions addressing the processor-memory performance gap include compiler-or programmer-applied optimizations like data structure padding, matrix blocking, and other program transformations. Compiler optimization can be effective, but the lack of precise analysis and optimization frameworks makes it impossible to confidently make optimal, rather than heuristic-based, program transformations. Imprecision is most problematic in situations where hard-to-predict cache conflicts foil heuristic approaches. Furthermore, the lack of a general framework for compiler memory performance analysis makes it impossible to understand the combined effects of several program transformations.The Cache Miss Equation (CME) framework discussed in this paper addresses these issues. We express memory reference and cache conflict behavior in terms of sets of equations. The mathematical precision of CMEs allows us to find true optimal solutions for transformations like blocking or padding. The generality of CMEs also allows us to reason about interactions between transformations applied in concert. Unlike our prior work, this framework applies to caches of arbitrary associativity. This paper also demonstrates the utility of CMEs by presenting precise algorithms for intra-variable padding, inter-variable padding, and selecting tile sizes. Our experiences with CMEs implemented in the SUIF system show that they are a unifying mathematical framework offering the generality and precision imperative for compiler optimizations on current high-performance architectures.