Enhanced adaptive insertion policy for shared caches

  • Authors:
  • Chongmin Li;Dongsheng Wang;Yibo Xue;Haixia Wang;Xi Zhang

  • Affiliations:
  • Tsinghua National Laboratory for Information Science and Technology, Department of Computer Science & Technology, Tsinghua University, Beijing, China (all five authors)

  • Venue:
  • APPT'11 Proceedings of the 9th international conference on Advanced parallel processing technologies
  • Year:
  • 2011

Abstract

The LRU replacement policy is commonly used in the last-level caches of multiprocessors. However, LRU does not work well for memory-intensive workloads whose working sets are larger than the available cache size. When a newly arrived cache block is inserted at the MRU position, it may never be reused before it is evicted, yet it occupies cache space for a long time while it moves from the MRU to the LRU position. This results in inefficient use of cache space. If a new cache block is instead inserted directly at the LRU position, cache performance can improve because some fraction of the working set is retained in the cache. In this work, we propose the Enhanced Dynamic Insertion Policy (EDIP) and the Thread Aware Enhanced Dynamic Insertion Policy (TAEDIP), which adjust the probability of insertion at MRU by set dueling. Runtime information for the previous and next BIP levels is gathered and compared with that of the current level to choose an appropriate BIP level. At the same time, access frequency is used to choose a victim. In this way, our design achieves a lower miss rate than LRU for workloads with large working sets. For workloads with small working sets, the miss rate of our design is close to that of the LRU replacement policy. Simulation results in a single-core configuration with a 1MB 16-way LLC show that EDIP reduces CPI over LRU and DIP by an average of 11.4% and 1.8% respectively. On a quad-core configuration with a 4MB 16-way LLC, TAEDIP improves performance on the weighted speedup metric by 11.2% over LRU and 3.7% over TADIP on average. On the fairness metric, TAEDIP improves performance by 11.2% over LRU and 2.6% over TADIP on average.
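
The effect of the insertion position can be illustrated with a minimal Python sketch. This is not the authors' implementation: it models a single cache set, replaces the probabilistic MRU insertion with a deterministic "every k-th miss" rule, and replaces set dueling with an offline comparison of neighbouring levels. The names `simulate`, `pick_level`, `mru_every`, and the level values are hypothetical, and the frequency-based victim selection mentioned in the abstract is not modelled.

```python
def simulate(trace, ways, mru_every):
    """Simulate one cache set and return the number of misses.

    mru_every == 1 inserts every missing block at MRU (plain LRU).
    mru_every == k > 1 inserts only every k-th missing block at MRU and
    all others at LRU (a deterministic stand-in for one insertion level;
    a real design would use a probability instead of a counter).
    """
    stack = []  # index 0 is the MRU position, the last index is LRU
    misses = 0
    for block in trace:
        if block in stack:
            # Hit: promote the block to the MRU position.
            stack.remove(block)
            stack.insert(0, block)
            continue
        misses += 1
        if len(stack) == ways:
            stack.pop()  # evict the block at the LRU position
        if misses % mru_every == 0:
            stack.insert(0, block)  # insert at MRU
        else:
            stack.append(block)  # insert at LRU
    return misses


def pick_level(trace, ways, levels, current):
    """Return the index of the best level among current and its neighbours.

    Assumption: an offline replay stands in for set dueling, which would
    gather these miss counts at run time from dedicated sample sets.
    """
    candidates = [i for i in (current - 1, current, current + 1)
                  if 0 <= i < len(levels)]
    return min(candidates, key=lambda i: simulate(trace, ways, levels[i]))


if __name__ == "__main__":
    # Cyclic working set of 20 blocks in a 16-way set: larger than the cache.
    trace = [b for _ in range(100) for b in range(20)]
    print("LRU misses:", simulate(trace, 16, 1))
    print("mostly-LRU-insertion misses:", simulate(trace, 16, 32))
    print("chosen level index:", pick_level(trace, 16, [1, 4, 32], 1))
```

On this cyclic trace, plain LRU misses on every access because each block is evicted just before it is reused, while inserting mostly at the LRU position keeps part of the working set resident. This is the behaviour the abstract describes for workloads whose working set exceeds the cache size.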