Non-Strict Cache Coherence: Exploiting Data-Race Tolerance in Emerging Applications

  • Authors:
  • Siddhartha V. Tambat;Sriram Vajapeyam

  • Venue:
  • ICPP '00 Proceedings of the 2000 International Conference on Parallel Processing
  • Year:
  • 2000

Abstract

Software distributed shared memory (DSM) platforms on networks of workstations tolerate large network latencies by employing one of several weak memory consistency models. Data-race tolerant applications, such as Genetic Algorithms (GAs) and Probabilistic Inference, offer an additional degree of freedom to tolerate network latency: they do not synchronize shared memory references, and they behave correctly when supplied with outdated shared data. However, these algorithms often have a high communication-to-computation ratio and can flood the network with messages in the presence of large message delays. We study the performance of controlled asynchronous implementations of these algorithms via the use of our previously proposed blocking Global Read memory access primitive. Global Read implements non-strict cache coherence by guaranteeing to return to the reader a shared datum value from within a specified staleness range. Experiments on an IBM SP2 multicomputer with an Ethernet network show significant performance improvements for controlled asynchronous implementations. On a lightly loaded Ethernet network, most of the GA benchmarks see 30% to 40% improvement over the best competitor for 2 to 16 processors, while two of the Probabilistic Inference benchmarks see more than 80% improvement for two processors. As the network load increases, the benefits of non-strict cache coherence increase significantly.
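
The bounded-staleness guarantee described in the abstract can be illustrated with a small sketch. This is not the authors' implementation or interface: the class name StaleBoundedCell, the method names, the use of an iteration counter as the measure of staleness, and the timeout parameter are all assumptions made here for illustration. The sketch models one locally cached shared datum; a read returns immediately if the cached copy is recent enough and otherwise blocks until a fresher update arrives.

```python
import threading


class StaleBoundedCell:
    """Hypothetical local cache of one shared datum, tagged with the
    writer's iteration number (an assumed measure of staleness)."""

    def __init__(self, value, version=0):
        self._value = value
        self._version = version
        self._cond = threading.Condition()

    def apply_update(self, value, version):
        """Install a newer value, e.g. when an update message arrives."""
        with self._cond:
            if version > self._version:  # drop out-of-order, older updates
                self._value = value
                self._version = version
                self._cond.notify_all()

    def global_read(self, reader_iter, max_staleness, timeout=None):
        """Return (value, version) once the cached copy is at most
        max_staleness iterations behind reader_iter; block until then."""
        oldest_ok = reader_iter - max_staleness
        with self._cond:
            fresh = self._cond.wait_for(
                lambda: self._version >= oldest_ok, timeout
            )
            if not fresh:
                raise TimeoutError("no sufficiently fresh value arrived")
            return self._value, self._version


if __name__ == "__main__":
    cell = StaleBoundedCell(1.0, version=3)
    # Reader at iteration 5 tolerates 2 iterations of staleness: no blocking.
    print(cell.global_read(reader_iter=5, max_staleness=2))
    # Reader at iteration 10 must wait for version >= 8.
    threading.Timer(0.1, cell.apply_update, args=(2.0, 8)).start()
    print(cell.global_read(reader_iter=10, max_staleness=2, timeout=5))
```

In this sketch a larger max_staleness lets the reader proceed on older data without waiting, while max_staleness of 0 forces the reader to wait for a value from its own iteration. Treating the staleness range as a tunable parameter in this way is one reading of the "controlled asynchronous" behaviour the abstract describes.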