NT-SwiFT: Software Implemented Fault Tolerance on Windows NT

  • Authors:
  • Yennun Huang;P. Emerald Chung;Chandra Kintala;Chung-Yih Wang;De-Ron Liang

  • Affiliations:
  • Bell Laboratories, Lucent Technologies, Inc., Murray Hill, NJ;Bell Laboratories, Lucent Technologies, Inc., Murray Hill, NJ;Bell Laboratories, Lucent Technologies, Inc., Murray Hill, NJ;Institute of Information Science, Academia Sinica, Taipei, Taiwan, ROC;Institute of Information Science, Academia Sinica, Taipei, Taiwan, ROC

  • Venue:
  • WINSYM'98 Proceedings of the 2nd conference on USENIX Windows NT Symposium - Volume 2
  • Year:
  • 1998

Abstract

More and more highly available applications are being implemented on Windows NT. However, the current version of Windows NT (NT4) lacks some facilities needed to implement such fault-tolerant applications. In this paper, we describe a set of components, collectively named NT-SwiFT (Software Implemented Fault Tolerance), that facilitates building fault-tolerant and highly available applications on Windows NT. NT-SwiFT provides components for automatic error detection and recovery, checkpointing, event logging and replay, communication error recovery, incremental data replication, IP packet re-routing, etc. The SwiFT components were originally designed for UNIX. The UNIX version was first ported to NT to run on UWIN [Korn97]. Gradually, a large portion of the software has been re-implemented to take advantage of native NT system services. This paper describes these components and compares the UNIX and NT implementations. We also describe some applications that use these components and discuss how to leverage NT system services and cope with some missing features.