ASR: Adaptive Selective Replication for CMP Caches

  • Authors: Bradford M. Beckmann; Michael R. Marty; David A. Wood
  • Affiliations: Microsoft Corporation; University of Wisconsin, Madison; University of Wisconsin, Madison
  • Venue: Proceedings of the 39th Annual IEEE/ACM International Symposium on Microarchitecture
  • Year: 2006

Abstract

The large working sets of commercial and scientific workloads stress the L2 caches of Chip Multiprocessors (CMPs). Some CMPs use a shared L2 cache to maximize the on-chip cache capacity and minimize off-chip misses. Others use private L2 caches, replicating data to limit the delay due to global wires and minimize cache access time. Recent hybrid proposals use selective replication to balance latency and capacity, but their static replication rules result in performance degradation for some combinations of workloads and system configurations. This paper proposes Adaptive Selective Replication (ASR), a mechanism that dynamically monitors workload behavior to control replication. ASR replicates cache blocks only when it estimates the benefit of replication (lower L2 hit latency) exceeds the cost (more L2 misses). Full-system simulations of 8-processor CMPs show that ASR provides robust performance, improving performance by as much as 29% versus shared caches, 19% versus private caches, and 12% versus CMP-NuRapid [9] and Victim Replication [41]. Furthermore, while ASR does not improve the performance of all workloads, it provides performance stability by always performing at least comparably to the best alternative, including Cooperative Caching [8].
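
The cost-benefit rule stated in the abstract (replicate only when the estimated latency benefit exceeds the estimated miss cost) can be illustrated with a minimal sketch. This is not the paper's hardware mechanism: the class `ReplicationEstimate`, its fields, the cycle counts, and the way benefit and cost are converted to cycles are all assumptions made here for illustration only.

```python
from dataclasses import dataclass


@dataclass
class ReplicationEstimate:
    """Hypothetical per-interval estimates; these fields are assumptions,
    not quantities defined in the abstract."""
    extra_local_hits: int   # L2 hits that would become local with more replication
    extra_misses: int       # additional L2 misses caused by the lost capacity
    local_hit_cycles: int   # assumed latency of a local L2 hit
    remote_hit_cycles: int  # assumed latency of a remote (on-chip) L2 hit
    miss_cycles: int        # assumed latency of an off-chip miss


def replication_benefit(e: ReplicationEstimate) -> int:
    # Benefit: cycles saved because remote hits become local hits.
    return e.extra_local_hits * (e.remote_hit_cycles - e.local_hit_cycles)


def replication_cost(e: ReplicationEstimate) -> int:
    # Cost: cycles lost because former on-chip hits become off-chip misses
    # (assumed here to have been remote hits before).
    return e.extra_misses * (e.miss_cycles - e.remote_hit_cycles)


def should_replicate(e: ReplicationEstimate) -> bool:
    # Replicate only when the estimated benefit exceeds the estimated cost.
    return replication_benefit(e) > replication_cost(e)


# Illustrative numbers only (not taken from the paper).
latency_bound = ReplicationEstimate(1000, 50, 10, 40, 300)
capacity_bound = ReplicationEstimate(1000, 200, 10, 40, 300)
print(should_replicate(latency_bound))   # benefit 30000 vs cost 13000
print(should_replicate(capacity_bound))  # benefit 30000 vs cost 52000
```

With these made-up inputs the first workload favors replication and the second does not, which is the kind of workload-dependent outcome that a static replication rule cannot capture.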