A Two-Level Checkpoint Algorithm in a Highly-Available Parallel Single Level Store System

  • Authors:
  • Christine Morin;Renaud Lottiaux;Anne-Marie Kermarrec

  • Venue:
  • CCGRID '01 Proceedings of the 1st International Symposium on Cluster Computing and the Grid
  • Year:
  • 2001

Abstract

A Parallel Single Level Store (PSLS) system integrates a shared virtual memory and a parallel file system. By managing data globally, such systems offer programmers of scientific applications the attractive shared memory programming model combined with a large and efficient file system in a cluster. In this paper, we present a cheap and efficient two-level checkpointing approach that enables a PSLS to tolerate failures. The first-level checkpointing algorithm is very efficient and saves data in memory, but it requires a large amount of memory space. When memories are saturated, an alternative algorithm that saves the checkpoint on disk is used. Performance results show the impact of different variants of the checkpointing algorithms.
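
The fallback policy described in the abstract (checkpoint in memory while space allows, switch to disk once memory is saturated) might be sketched roughly as follows. This is a single-process illustration only, not the authors' algorithm: the class name `TwoLevelCheckpointer`, the byte budget used as the saturation test, the per-page granularity, and the file naming are all assumptions made for the example.

```python
import os


class TwoLevelCheckpointer:
    """Hypothetical sketch: level 1 keeps checkpoints in memory,
    level 2 writes them to disk once the memory budget is exhausted."""

    def __init__(self, memory_budget_bytes, disk_dir):
        self.memory_budget = memory_budget_bytes  # assumed saturation threshold
        self.disk_dir = disk_dir                  # assumed checkpoint directory
        self.in_memory = {}   # page_id -> bytes (level 1)
        self.on_disk = {}     # page_id -> file path (level 2)
        self.memory_used = 0

    def save(self, page_id, data):
        """Checkpoint one page and return the level used: "memory" or "disk"."""
        self._discard(page_id)  # a new checkpoint replaces the previous copy
        if self.memory_used + len(data) <= self.memory_budget:
            self.in_memory[page_id] = bytes(data)
            self.memory_used += len(data)
            return "memory"
        # Memory is saturated: fall back to the disk-based level.
        path = os.path.join(self.disk_dir, f"page_{page_id}.ckpt")
        with open(path, "wb") as f:
            f.write(data)
        self.on_disk[page_id] = path
        return "disk"

    def restore(self, page_id):
        """Return the checkpointed bytes for a page, from whichever level holds it."""
        if page_id in self.in_memory:
            return self.in_memory[page_id]
        with open(self.on_disk[page_id], "rb") as f:
            return f.read()

    def _discard(self, page_id):
        """Drop any earlier checkpoint of this page and release its space."""
        old = self.in_memory.pop(page_id, None)
        if old is not None:
            self.memory_used -= len(old)
        path = self.on_disk.pop(page_id, None)
        if path is not None and os.path.exists(path):
            os.remove(path)
```

With a budget of 8 bytes, for instance, two 4-byte pages would be kept in memory and a third would be written to disk. A real PSLS would additionally have to coordinate checkpoints across cluster nodes, which this sketch does not attempt.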