Scalable directory architecture for distributed shared memory chip multiprocessors

  • Authors:
  • Huan Fang; Mats Brorsson

  • Affiliations:
  • KTH School of Information and Communication Technology;KTH School of Information and Communication Technology

  • Venue:
  • ACM SIGARCH Computer Architecture News
  • Year:
  • 2009

Abstract

Traditional directory-based cache coherence protocols are far from optimal for large-scale cache-coherent shared-memory multiprocessors because of the increasing latency of accessing directories stored in DRAM. Instead of keeping directories in main memory, we consider distributing the directory together with the L2 cache across all nodes of a chip multiprocessor. Each node contains a processing unit, a private L1 cache, a slice of the L2 cache, a memory controller, and a router. Both the L2 cache and the memory are distributed and shared, and are interleaved by a subset of the memory address bits. All nodes are interconnected through a low-latency two-dimensional mesh network. The directory, a component separate from the L2 cache, stores only sharing information for blocks, while the L2 cache stores only data blocks and is exclusive with the L1 cache. A shared L2 cache can increase the total effective on-chip cache capacity, but it also increases the miss latency when data resides on a remote node. Unlike a directory cache structure, our proposal removes the directory from memory entirely, which saves memory space and reduces access latency. Compared to an L2 cache that stores directory information internally, our L2 cache structure saves up to 88% of cache space and achieves similar performance.
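
The address interleaving and mesh layout described in the abstract can be illustrated with a minimal sketch. The abstract does not give the block size, the mesh dimensions, or which address bits select the home node, so the 64-byte block, the 4x4 mesh, and the choice of the low-order block-address bits are all assumptions made for illustration; the function names are hypothetical and not taken from the paper.

```python
# Illustrative sketch only: parameters below are assumptions, not values from the paper.
BLOCK_BITS = 6  # assumed 64-byte cache blocks
MESH_DIM = 4    # assumed 4x4 mesh, i.e. 16 nodes


def home_node(addr, block_bits=BLOCK_BITS, mesh_dim=MESH_DIM):
    """Pick the node whose L2 slice and directory slice own this address.

    Assumes interleaving on the low-order bits of the block address.
    """
    nodes = mesh_dim * mesh_dim
    return (addr >> block_bits) % nodes


def coords(node, mesh_dim=MESH_DIM):
    """Return the (row, column) position of a node, numbered row by row."""
    return divmod(node, mesh_dim)


def hops(src, dst, mesh_dim=MESH_DIM):
    """Manhattan distance between two nodes on the 2D mesh."""
    (r1, c1), (r2, c2) = coords(src, mesh_dim), coords(dst, mesh_dim)
    return abs(r1 - r2) + abs(c1 - c2)


if __name__ == "__main__":
    addr = 0x1240
    home = home_node(addr)
    print(f"address {addr:#x} -> home node {home} at {coords(home)}")
    print(f"hops from node 0 to home: {hops(0, home)}")
```

Under these assumptions, consecutive blocks map to consecutive nodes, so a request from a node to a remote home pays a number of network hops that grows with the mesh distance. This is the remote-miss latency cost of a shared L2 that the abstract mentions.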