Transparent checkpointing and rollback recovery mechanism for Windows NT applications

  • Authors:
  • Youhui Zhang;Dongsheng Wang;Weimin Zheng

  • Venue:
  • ACM SIGOPS Operating Systems Review
  • Year:
  • 2001

Abstract

Clusters of industry-standard computers running Windows NT are emerging as a competitive alternative for large-scale parallel computing. However, clusters become more susceptible to failure as the number of nodes grows, so high availability must be provided on Windows NT. This paper introduces the Checkpoint and Rollback Recovery (CRR) mechanism on Windows NT and presents WinNTCkpt, a checkpointing and recovery tool that we implemented. WinNTCkpt transparently checkpoints and recovers applications running on Windows NT. To use the tool, the user needs to provide only the application's executable code, not its source code. WinNTCkpt also provides a set of CRR APIs with which users can build highly available applications. By means of API interception and thread injection, WinNTCkpt adds CRR functions to some existing applications. WinNTCkpt has been shown to work with Windows NT 4.0 and Windows 2000 applications.