High Availability of the Memory Hierarchy in a Cluster

  • Authors:
  • Christine Morin; Renaud Lottiaux; Anne-Marie Kermarrec

  • Venue:
  • SRDS '00 Proceedings of the 19th IEEE Symposium on Reliable Distributed Systems
  • Year:
  • 2000

Abstract

A Single Level Store (SLS) that integrates a Shared Virtual Memory and a Parallel File System, with file mapping as the interface, is attractive for executing high-performance applications in a cluster. However, the probability of a node reboot or failure in a cluster is quite high. In this paper, we present the design of a highly available SLS system. Our approach combines in-memory checkpointing with permanent checkpointing on disk, using all of the cluster's memory and disk resources. Preliminary performance results show that the proposed approach is applicable to parallel applications with huge input/output requirements.