Atomic Coherence: Leveraging nanophotonics to build race-free cache coherence protocols

Authors:
Dana Vantrease; Mikko H. Lipasti; Nathan Binkert
Affiliations:
Univ of Wisconsin - Madison, Madison, WI; Univ of Wisconsin - Madison, Madison, WI; HP Labs, Palo Alto, CA
Venue:
HPCA '11 Proceedings of the 2011 IEEE 17th International Symposium on High Performance Computer Architecture
Year:
2011

Abstract

This paper advocates Atomic Coherence, a framework that simplifies cache coherence protocol specification, design, and verification by decoupling races from the protocol's operation. Atomic Coherence requires that conflicting coherence requests to the same address be serialized with a mutex before they are issued. Once issued, requests follow a predictable, race-free path. Because requests are guaranteed not to race, coherence protocols are simpler and protocol extensions are straightforward. Our implementation of Atomic Coherence uses optical mutexes because optics provides very low latency. We begin with a state-of-the-art non-atomic MOEFSI protocol and demonstrate that an atomic implementation is much simpler while imposing less than a 2% performance penalty. We then show how, in the absence of races, it is easy to add support for speculative coherence and improve performance by up to 70%. Similar performance gains may be possible in a non-atomic protocol, but not without considerable effort in race management.
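
The core idea in the abstract, acquiring a mutex for an address before a coherence request is issued, can be illustrated with a small software analogy. The sketch below is not the paper's optical implementation: it substitutes Python's `threading.Lock` for the optical mutexes, and the names `AddressMutexTable`, `Directory`, `request_exclusive`, and `run_demo`, the 64-byte block size, and the mapping of many blocks onto a fixed pool of mutexes are all assumptions made for illustration.

```python
import threading


class AddressMutexTable:
    """Hypothetical software stand-in for per-address mutexes."""

    def __init__(self, num_mutexes=64, block_bytes=64):
        self.block_bytes = block_bytes
        self.locks = [threading.Lock() for _ in range(num_mutexes)]

    def mutex_for(self, address):
        # Assumption: addresses in the same cache block always map to
        # the same mutex; distinct blocks may share one.
        block = address // self.block_bytes
        return self.locks[block % len(self.locks)]


class Directory:
    """Toy directory that tracks one exclusive owner per block."""

    def __init__(self, table):
        self.table = table
        self.owner = {}  # block number -> core holding write permission
        self.log = []    # (core, block, previous owner), in issue order

    def request_exclusive(self, core, address):
        block = address // self.table.block_bytes
        # The mutex is taken BEFORE the request is issued, so the body
        # below never observes another in-flight request to this block.
        with self.table.mutex_for(address):
            previous = self.owner.get(block)
            self.owner[block] = core
            self.log.append((core, block, previous))
            return previous


def run_demo(num_cores=8, address=0x1000):
    """Issue conflicting requests to one address from several threads."""
    directory = Directory(AddressMutexTable())
    threads = [
        threading.Thread(target=directory.request_exclusive,
                         args=(core, address))
        for core in range(num_cores)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return directory
```

Calling `run_demo()` issues eight conflicting requests to the same block. The order in which cores win the mutex varies between runs, but each logged request sees the previous winner as the prior owner, which is the serialization property the abstract describes. The handler itself needs no transient states or retry logic because the race was resolved before it ran.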