SPACE: sharing pattern-based directory coherence for multicore scalability

  • Authors:
  • Hongzhou Zhao;Arrvindh Shriraman;Sandhya Dwarkadas

  • Affiliations:
  • University of Rochester, Rochester, NY, USA;University of Rochester, Rochester, NY, USA;University of Rochester, Rochester, NY, USA

  • Venue:
  • Proceedings of the 19th international conference on Parallel architectures and compilation techniques
  • Year:
  • 2010

Quantified Score

Hi-index 0.00

Visualization

Abstract

An important challenge in multicore processors is the maintenance of cache coherence in a scalable manner. Directory-based protocols save bandwidth and achieve scalability by associating information about sharer cores with every cache block. As the number of cores and cache sizes increase, the directory itself adds significant area and energy overheads. In this paper, we propose SPACE, a directory design based on recognizing and representing the subset of sharing patterns present in an application. SPACE takes advantage of the observation that many memory locations in an application are accessed by the same set of processors, resulting in a few sharing patterns that occur frequently. The sharing pattern of a cache block is the bit vector representing the processors that share the block. SPACE decouples the sharing pattern from each cache block and holds them in a separate directory table. Multiple cache lines that have the same sharing pattern point to a common entry in the directory table. In addition, when the table capacity is exceeded, patterns that are similar to each other are dynamically collated into a single entry. Our results show that overall, SPACE is within 2% of the performance of a conventional directory. When compared to coarse vector directories, dynamically collating similar patterns eliminates more false sharers. Our experimentation also reveals that a small directory table (256-512 entries) can handle the access patterns in many applications, with the SPACE directory table size being O(P) and requiring a pointer per cache line whose size is O(log2P). Specifically, SPACE requires E44% of the area of a conventional directory at 16 processors and 25% at 32 processors.