An Analytical Model for Hybrid Checkpointing in Time Warp Distributed Simulation

Authors:
Hussam M. Soliman;Adel Said Elmaghraby
Affiliations:
King Saud Univ., Saudi Arabia;Univ. of Louisville, Louisville, KY
Venue:
IEEE Transactions on Parallel and Distributed Systems
Year:
1998

Abstract

The Time Warp distributed simulation algorithm uses checkpointing to save process states after certain event executions for later recovery at the time of a rollback. Two main techniques have been used for checkpointing: periodic state saving and incremental state saving. The former technique introduces large overheads in reconstructing a desired state by coasting forward from an earlier checkpointed state if the computational granularity is large. The latter technique also has large overheads in applications with large rollback distances. A hybrid checkpointing technique is proposed which uses both periodic and incremental state saving simultaneously in such a way that it reduces checkpointing time overheads. A detailed analytical model is developed for the hybrid technique, and comparisons are made using similar analytical models with periodic and incremental state saving techniques. Results show that when the system parameters are chosen to represent large and complex simulated systems, the hybrid approach has less checkpointing time overhead than the other two techniques.
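To make the idea of combining the two techniques concrete, the following is a minimal sketch, not the authors' algorithm or analytical model. It assumes a toy logical process whose state is a flat dictionary of immutable values, takes a full copy every `interval` events (periodic state saving), and records the overwritten values of every event in an undo log (incremental state saving). The class name `HybridLP`, its methods, and the rollback policy (jump to the nearest full checkpoint at or after the target, then undo the remaining increments) are all hypothetical choices made for illustration.

```python
_MISSING = object()  # marks keys that did not exist before an event


class HybridLP:
    """Toy logical process (hypothetical): periodic copies plus an undo log."""

    def __init__(self, state, interval):
        self.state = dict(state)
        self.interval = interval             # events between full checkpoints
        self.executed = 0                    # number of events executed so far
        self.checkpoints = {0: dict(state)}  # event count -> full state copy
        self.undo_log = []                   # undo_log[i-1]: old values before event i

    def execute(self, updates):
        """Apply one event, given as a dict of state updates."""
        # Incremental state saving: remember only the values being overwritten.
        old = {k: self.state.get(k, _MISSING) for k in updates}
        self.undo_log.append(old)
        self.state.update(updates)
        self.executed += 1
        # Periodic state saving: full copy every `interval` events.
        if self.executed % self.interval == 0:
            self.checkpoints[self.executed] = dict(self.state)

    def rollback(self, target):
        """Restore the state as it was after `target` events.

        Returns the number of incremental undo steps that were needed.
        """
        # Assumed policy: start from the nearest full checkpoint at or after
        # the target, so at most `interval - 1` increments are undone and no
        # coasting forward (event re-execution) is required.
        candidates = [c for c in self.checkpoints if target <= c <= self.executed]
        start = self.executed
        if candidates:
            start = min(candidates)
            self.state = dict(self.checkpoints[start])
        for i in range(start, target, -1):
            for key, value in self.undo_log[i - 1].items():
                if value is _MISSING:
                    self.state.pop(key, None)
                else:
                    self.state[key] = value
        # Discard history that lies in the rolled-back future.
        del self.undo_log[target:]
        self.checkpoints = {c: s for c, s in self.checkpoints.items() if c <= target}
        self.executed = target
        return start - target
```

In this sketch, a process with `interval=4` that has executed 10 events and rolls back to event 3 restores the full copy taken at event 4 and undoes a single increment, rather than undoing seven increments or re-executing three events from the copy at event 0. Whether this trade-off pays off depends on the cost parameters that the paper's analytical model captures, which this example does not attempt to reproduce.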