Coherence-Centric Logging and Recovery for Home-Based Software Distributed Shared Memory

  • Authors:
  • Angkul Kongmunvattana; Nian-Feng Tzeng

  • Venue:
  • ICPP '99 Proceedings of the 1999 International Conference on Parallel Processing
  • Year:
  • 1999

Abstract

The probability of failures in software distributed shared memory (SDSM) increases as the system size grows. This paper introduces a new, efficient message logging technique, called the coherence-centric logging (CCL) and recovery protocol, for home-based SDSM. Our CCL minimizes failure-free overhead by logging only the data necessary for correct recovery, and it tolerates high disk access latency by overlapping disk accesses with the coherence-induced communication already present in home-based SDSM. Our recovery protocol reduces recovery time by prefetching data according to future shared memory access patterns, thus eliminating the memory miss idle penalty during the recovery process. To the best of our knowledge, this is the first work that considers crash recovery in home-based SDSM.

We have performed experiments on a cluster of eight Sun Ultra-5 workstations, comparing our CCL against traditional message logging (ML) by modifying TreadMarks, a state-of-the-art SDSM system, to support the home-based protocol and then implementing both the CCL and ML protocols in it. The experimental results show that our CCL protocol consistently outperforms the ML protocol: our protocol increases execution time by only 1% to 6% during failure-free execution, while the ML protocol incurs an execution time overhead of 9% to 24% due to its large log size and high disk access latency. Our recovery protocol improves crash recovery speed by 55% to 84% compared to re-execution, and it outperforms ML-recovery by 5% to 18% for the parallel applications examined.
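The idea of hiding disk latency behind communication can be illustrated with a small sketch. This is not the authors' implementation: the abstract does not say what is logged or at which protocol step, so the function names (`flush_log`, `send_updates_to_home`, `release_with_overlapped_logging`), the record format, and the simulated network delay are all hypothetical. The sketch only shows the general pattern of starting a log flush in the background, performing a message exchange while the flush runs, and waiting for the flush to finish before proceeding.

```python
import os
import tempfile
import threading
import time


def flush_log(path, records):
    """Append log records to a file and force them to stable storage."""
    with open(path, "ab") as f:
        for rec in records:
            f.write(rec)
        f.flush()
        os.fsync(f.fileno())


def send_updates_to_home(delay_s):
    """Hypothetical stand-in for a coherence message exchange with a home node."""
    time.sleep(delay_s)  # simulated network round trip
    return "ack"


def release_with_overlapped_logging(path, records, comm_delay_s=0.05):
    """Overlap the log flush with communication, then wait for both."""
    flusher = threading.Thread(target=flush_log, args=(path, records))
    flusher.start()                           # disk access starts in the background
    ack = send_updates_to_home(comm_delay_s)  # communication proceeds meanwhile
    flusher.join()                            # continue only once the log is durable
    return ack


if __name__ == "__main__":
    log_path = os.path.join(tempfile.mkdtemp(), "ccl.log")
    print(release_with_overlapped_logging(log_path, [b"diff-1\n", b"diff-2\n"]))
```

With this structure, the elapsed time of the call is roughly the longer of the flush and the message exchange rather than their sum, which is the sense in which the disk latency is tolerated.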