Where replacement algorithms fail: a thorough analysis

  • Authors:
  • Georgios Keramidas;Pavlos Petoumenos;Stefanos Kaxiras

  • Affiliations:
  • Industrial Systems Institute, Patras, Greece;University of Patras, Patras, Greece;University of Patras, Patrras, Greece

  • Venue:
  • Proceedings of the 7th ACM international conference on Computing frontiers
  • Year:
  • 2010

Quantified Score

Hi-index 0.00

Visualization

Abstract

Cache placement and eviction, especially at the last level of the memory hierarchy, have received a flurry of research activity recently. The common perception that LRU is a well-performing algorithm has recently been discredited: many researchers have turned their attention to more sophisticated algorithms that are able to substantially improve cache performance. In this paper, we thoroughly examine four recently proposed replacement policies: the Dynamic Insertion Policy (DIP), the Shepherd Cache (SC), the MLP-aware replacement, and the Instruction-based Reuse Distance Prediction (IbRDP) replacement policy. Our experimental studies show that there is a great inconsistency between the number of misses saved by each mechanism and the resulting improvement in IPC. This is particularly true for the DIP and the SC approach and indeed attest to the fact that these algorithms do not take into account the relative cost of each miss (i.e., whether it is an isolated or parallel miss). Their aim is to blindly lower the total number of misses. On the other hand, the MLP-aware replacement, although miss-cost-aware, cannot handle efficiently workloads which display LRU-hostile behavior and thus fails to reduce execution time even when there are ample opportunities to reduce cache misses. The IbRDP replacement policy shows both the ability to deal with non-LRU access patterns and MLP friendliness leading to greater consistency between the reduction of misses and the corresponding increase in performance thus the largest IPC improvement among the studied mechanisms. So, what are the appropriate characteristics of a replacement algorithm targeting the lower levels of the memory hierarchy? In this paper we are shedding some light on this question.