A New Approach for High Performance Computing Systems with Various Checkpointing Schemes

  • Authors:
  • Gyung-Leen Park; Hee Yong Youn

  • Affiliations:
  • Department of Computer Science and Statistics, Cheju National University, Cheju, Korea; School of Information and Communications Engineering, Sungkyunkwan University, Suwon, Korea

  • Venue:
  • The Journal of Supercomputing
  • Year:
  • 2005

Abstract

Roll-forward recovery schemes were proposed to improve the performance of fault-tolerant systems that employ checkpointing. In these schemes, multiple processors perform roll-forward and validation processing simultaneously. This paper proposes a sample comparison approach, used together with checkpointing, which further improves performance by reducing the overhead that checkpointing imposes. We also develop general analytical models for estimating availability that are applicable to any checkpointing scheme. Performance comparisons reveal that checkpointing schemes with sample comparison achieve higher availability than the same schemes without it, while the required checkpoint interval is larger.
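
To make the link between checkpoint overhead and availability concrete, the following is a minimal sketch of a textbook rollback model. It is not the analytical model developed in the paper, and it covers neither roll-forward recovery nor sample comparison. It assumes failures arrive as a Poisson process, that a failure discards the current interval and restarts it from the last checkpoint, and that no failure occurs during the restart itself. The function name and the parameters `interval`, `overhead`, `failure_rate`, and `restart_cost` are illustrative choices, not symbols from the paper.

```python
import math


def availability(interval, overhead, failure_rate, restart_cost=0.0):
    """Fraction of elapsed time spent on useful work (assumed rollback model).

    interval      useful computation between checkpoints (time units)
    overhead      time added by taking one checkpoint
    failure_rate  Poisson failure rate, in failures per time unit
    restart_cost  time to restore the last checkpoint after a failure
                  (assumed failure-free)
    """
    if interval <= 0 or overhead < 0 or failure_rate <= 0 or restart_cost < 0:
        raise ValueError("interval and failure_rate must be positive; "
                         "overhead and restart_cost must be non-negative")
    segment = interval + overhead
    # Expected wall-clock time to finish one segment when every failure
    # forces the segment to start over:
    #   E = (1/rate + restart_cost) * (exp(rate * segment) - 1)
    expected_time = (1.0 / failure_rate + restart_cost) * math.expm1(
        failure_rate * segment
    )
    return interval / expected_time


if __name__ == "__main__":
    # Hypothetical numbers, chosen only to show the direction of the effect.
    for overhead in (50.0, 10.0):
        value = availability(interval=1000.0, overhead=overhead,
                             failure_rate=1e-5, restart_cost=5.0)
        print(f"overhead={overhead:5.1f}  availability={value:.4f}")
```

Under these assumptions, lowering `overhead` at a fixed `interval` always raises the computed availability, which is the general effect that reducing checkpointing overhead aims for. The model says nothing about how the checkpoint interval of the schemes studied in the paper behaves.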