An Object-Oriented Testbed for the Evaluation of Checkpointing and Recovery Systems

  • Authors:
  • B. Ramamurthy;S. J. Upadhyaya;R. K. Iyer

  • Affiliations:
  • -;-;-

  • Venue:
  • FTCS '97 Proceedings of the 27th International Symposium on Fault-Tolerant Computing (FTCS '97)
  • Year:
  • 1997

Quantified Score

Hi-index 0.00

Visualization

Abstract

This paper presents the design and development of an object-oriented testbed for simulation and analysis of checkpointing and recovery schemes in distributed systems. An important contribution of the testbed is a unified environment that provides a set of specialized components for easy and detailed simulation of checkpointing and recovery schemes. The testbed allows a designer to mix and match different components either to study the effectiveness of a particular scheme or to freely experiment with hybrid designs before the actual implementation. The testbed also facilitates the evaluation of interdependencies among the various parameters such as communication and application dynamics and their effect on the performance of checkpointing and recovery schemes. The implementation of the testbed as an extension of DEPEND which is an integrated design and fault-injection environment, provides for unique system-level dependability analysis under realistic fault conditions unlike existing simulation tools. We illustrate the versatility of the testbed by using four diverse applications, ranging from the comparison of performances of two checkpointing and recovery schemes to the study of the effect of checkpoint size.