Effects of the checkpoint interval on time and space in time warp

  • Authors:
  • Bruno R. Preiss;Wayne M. Loucks;Ian D. Macintyre

  • Affiliations:
  • Univ. of Waterloo, Waterloo, Ont., Canada;Univ. of Waterloo, Waterloo, Ont., Canada;Univ. of Waterloo, Waterloo, Ont., Canada

  • Venue:
  • ACM Transactions on Modeling and Computer Simulation (TOMACS)
  • Year:
  • 1994

Quantified Score

Hi-index 0.00

Visualization

Abstract

Optimistically synchronized parallel discrete-event simulation is based on the use of communicating sequential processes. Optimistic synchronization means that the processes proceed under the assumption that a synchronized execution schedule is fortuitous. Periodic checkpointing of the state of a process allows the process to roll back to an earlier state when synchronization errors are detected. This article examines the effects of varying the checkpoint interval on the execution time and memory space needed to perform a parallel simulation.The empirical results presented in this article were obtained from the simulation of closed stochastic queuing networks with several different topologies. Various intraprocessor process-scheduling algorithms and both lazy and aggressive cancellation strategies are considered. The empirical results are compared with analytical formulae predicting time-optimal checkpoint intervals. Two modes of operation, throttling and thrashing, have been noted and their effect examined. As the checkpoint interval is increased from one, there is a throttling effect among processes on the same processor, which improves performance. When the checkpoint interval is made too large, there is a thrashing effect caused by interaction between processes on different processors. It is shown that the time-optimal and space-optimal checkpoint intervals are not the same. Furthermore, a checkpoint interval that is too small affects space adversely more than time, whereas, a checkpoint interval that is too large affects time adversely more than space.