F. Zambonelli
Staggered Consistent Checkpointing
IEEE Transactions on Parallel and Distributed Systems
ROS: the rollback-one-step method to minimize the waiting time during debugging long-running parallel programs
VECPAR'02: Proceedings of the 5th International Conference on High Performance Computing for Computational Science