A quasi-synchronous checkpointing algorithm that prevents contention for stable storage

  • Authors:
  • D. Manivannan; Q. Jiang; Jianchang Yang; M. Singhal

  • Affiliations:
  • Department of Computer Science, University of Kentucky, Lexington, KY 40506, United States; Department of Computer Science, University of Kentucky, Lexington, KY 40506, United States; Department of Computer and Information Sciences, SUNY, Fredonia, Fredonia, NY 14063, United States; Department of Computer Science, University of Kentucky, Lexington, KY 40506, United States

  • Venue:
  • Information Sciences: an International Journal
  • Year:
  • 2008

Abstract

Checkpointing and rollback recovery are established techniques for handling failures in distributed systems. Under synchronous checkpointing, every process involved in the distributed computation takes a checkpoint at almost the same time. This causes contention for network stable storage and degrades performance, because processes may have to wait a long time for the checkpointing operation to complete. In this paper, we propose a staggered quasi-synchronous checkpointing algorithm that reduces contention for network stable storage without incurring any synchronization overhead.
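
The contention effect described in the abstract can be illustrated with a toy queueing model. The sketch below is not the algorithm proposed in the paper. It assumes, purely for illustration, that stable storage serves one checkpoint write at a time in first-come-first-served order and that every write takes the same time. The function name `waiting_times` and the parameters `start_times` and `write_time` are hypothetical.

```python
def waiting_times(start_times, write_time):
    """Per-process wait before a checkpoint write can begin.

    Assumed model (not from the paper): a single storage server,
    FIFO order, and a fixed write_time for every checkpoint.
    """
    storage_free_at = 0.0
    waits = []
    for start in sorted(start_times):
        # A write begins when the process is ready and storage is idle.
        begin = max(start, storage_free_at)
        waits.append(begin - start)
        storage_free_at = begin + write_time
    return waits


n, write_time = 4, 2.0

# Synchronous: all processes try to write at the same instant.
synchronous = waiting_times([0.0] * n, write_time)

# Staggered: process i starts once the previous write has finished.
staggered = waiting_times([i * write_time for i in range(n)], write_time)

print(synchronous)  # [0.0, 2.0, 4.0, 6.0]
print(staggered)    # [0.0, 0.0, 0.0, 0.0]
```

In this simplified model the total wait under simultaneous checkpointing grows quadratically with the number of processes, while perfectly staggered writes incur no waiting at all. A real system would stagger only approximately, and how the paper achieves this without synchronization messages is not described in the abstract.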