A Case for MLP-Aware Cache Replacement

  • Authors:
  • Moinuddin K. Qureshi;Daniel N. Lynch;Onur Mutlu;Yale N. Patt

  • Affiliations:
  • University of Texas at Austin;University of Texas at Austin;University of Texas at Austin;University of Texas at Austin

  • Venue:
  • Proceedings of the 33rd annual international symposium on Computer Architecture
  • Year:
  • 2006

Quantified Score

Hi-index 0.00

Visualization

Abstract

Performance loss due to long-latency memory accesses can be reduced by servicing multiple memory accesses concurrently. The notion of generating and servicing long-latency cache misses in parallel is called Memory Level Parallelism (MLP). MLP is not uniform across cache misses - some misses occur in isolation while some occur in parallel with other misses. Isolated misses are more costly on performance than parallel misses. However, traditional cache replacement is not aware of the MLP-dependent cost differential between different misses. Cache replacement, if made MLP-aware, can improve performance by reducing the number of performance-critical isolated misses. This paper makes two key contributions. First, it proposes a framework for MLP-aware cache replacement by using a runtime technique to compute the MLP-based cost for each cache miss. It then describes a simple cache replacement mechanism that takes both MLP-based cost and recency into account. Second, it proposes a novel, low-hardware overhead mechanism called Sampling Based Adaptive Replacement (SBAR), to dynamically choose between an MLP-aware and a traditional replacement policy, depending on which one is more effective at reducing the number of memory related stalls. Evaluations with the SPEC CPU2000 benchmarks show that MLP-aware cache replacement can improve performance by as much as 23%.