On the performance benefits of sharing and privatizing second and third-level cache memories in homogeneous multi-core architectures

  • Authors:
  • Fadi N. Sibai

  • Affiliations:
  • College of Information Technology, UAE University, Al Ain, United Arab Emirates

  • Venue:
  • Microprocessors & Microsystems
  • Year:
  • 2008

Quantified Score

Hi-index 0.00

Visualization

Abstract

The benefits and deficiencies of shared and private caches have been identified by researchers. The performance impact of privatizing or sharing caches on homogeneous multi-core architectures is less understood. This paper investigates the performance impact of cache sharing on a homogeneous same-ISA 16-core processor with private first-level (L1) caches by considering 3 cache models which vary the sharing property of second-level (L2) and third-level (L3) cache banks. It is observed that across many scenarios, the cache privatization's average memory access time improved as the L1 cache miss rate increased and/or the cross-partition interconnect latencies increased. Under uniform memory address distribution, and when the L3 cache miss rate is close to 0, privatizing both L2s and L3s performs best among the 3 cache models. Furthermore, we mathematically demonstrate that when the interconnect's bridge latency is below 264 cycles, privatizing L2 caches beats privatizing both L2 and L3 caches, while the reverse is true for large bridge latencies representing high-traffic and heavy workload applications. For large interconnect delays, the private L2 and L3 model is best. For low to moderate interconnect latencies, and when the L3 miss rate is not close to 0, sharing both L2 and L3 banks among all cores performs best followed by privatizing L2s, while privatizing both L2s and L3s ranks last. Under worst case address distributions, cache privatizing benefits generally increase, and with large bridge latencies, privatizing L2 and L3 banks outperforms the other cache models. This reveals that as application workloads become heavier with time, resulting in large cache miss rates and long bridge and interconnect delays, privatizing L2 and L3 caches may prove beneficial. Under less stressful workloads, sharing both L2 and L3 caches have the upper hand. This study confirms the desired configurability and flexibility of the cache memory's sharing degree based on the running workload.