Virtual tree coherence: Leveraging regions and in-network multicast trees for scalable cache coherence

  • Authors:
  • Natalie D. Enright Jerger;Li-Shiuan Peh;Mikko H. Lipasti

  • Affiliations:
  • Dept of Electrical and Comp. Engineering, University of Wisconsin-Madison, 53706, USA;Dept of Electrical Engineering, Princeton University, NJ 08544, USA;Dept of Electrical and Comp. Engineering, University of Wisconsin-Madison, 53706, USA

  • Venue:
  • Proceedings of the 41st annual IEEE/ACM International Symposium on Microarchitecture
  • Year:
  • 2008

Abstract

Scalable cache coherence solutions are imperative to drive the many-core revolution forward. To fully realize the massive computation power of these many-core architectures, the communication substrate must be carefully examined and streamlined. There is tension between the need for an ordered interconnect to simplify coherence and the need for an unordered interconnect to provide scalable communication. In this work, we propose a coherence protocol, Virtual Tree Coherence (VTC), that relies on a virtually ordered interconnect. Our virtual ordering can be overlaid on any unordered interconnect to provide scalable, high-bandwidth communication. Specifically, VTC keeps track of sharers of a coarse-grained region, and multicasts requests to them through a virtual tree, employing properties of the virtual tree to enforce ordering amongst coherence requests. We compare VTC against a commonly used directory-based protocol and a greedy-order protocol extended onto an unordered interconnect. VTC outperforms both of these by averages of 25% and 11% in execution time respectively across a suite of scientific and commercial applications on 16 cores. For a 64-core system running server consolidation workloads, VTC outperforms directory and greedy protocols with average runtime improvements of 31% and 12%.
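
The region tracking and ordered multicast described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the region size, the class and method names, and the use of a single per-region sequence counter standing in for the ordering that the virtual tree provides are all assumptions made for illustration only.

```python
from collections import defaultdict

# Assumed region size in bytes; the abstract only says "coarse-grained region".
REGION_BYTES = 1024


def region_of(address: int) -> int:
    """Map a byte address to its coarse-grained region index."""
    return address // REGION_BYTES


class RegionTracker:
    """Hypothetical model: tracks the sharers of each region and assigns each
    request a per-region order, standing in for the ordering that the
    virtual tree is said to enforce."""

    def __init__(self) -> None:
        self.sharers = defaultdict(set)    # region -> set of core ids
        self.next_seq = defaultdict(int)   # region -> next order number

    def request(self, core: int, address: int):
        """Order a coherence request and return (sequence number,
        sorted list of cores the request would be multicast to)."""
        region = region_of(address)
        seq = self.next_seq[region]
        self.next_seq[region] += 1
        # Multicast only to the current sharers of the region,
        # not to every core in the system.
        destinations = sorted(self.sharers[region] - {core})
        self.sharers[region].add(core)
        return seq, destinations
```

In this sketch a request reaches only the cores already recorded as sharers of the region, and every request to the same region receives a distinct, increasing sequence number, so all sharers would observe the requests in the same order.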