This paper proposes a method, based on live-variable analysis, for reducing the data saved by application-level checkpointing of MPI programs. We present the implementation of a source-to-source precompiler (CAC) that automates application-level checkpointing using this optimization. The experiment shows that CAC automates application-level checkpointing correctly and reduces checkpoint data effectively.
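To illustrate the idea behind the optimization, the following is a minimal sketch of how live-variable analysis can shrink a checkpoint: only variables that may still be read after the checkpoint location need to be saved. It is not CAC's implementation. The function name `live_at`, the `(defs, uses)` statement encoding, and the sample program are all hypothetical, and the sketch assumes straight-line code with no branches, loops, or MPI calls, which a real precompiler would have to handle with a fixed-point analysis over the control-flow graph.

```python
def live_at(stmts, checkpoint_index):
    """Return the variables live just before stmts[checkpoint_index].

    Each statement is a (defs, uses) pair of sets.
    Assumption: straight-line code only (no branches or loops).
    """
    live = set()
    # Walk backward from the last statement to the checkpoint location,
    # applying the standard transfer function: live = (live - defs) | uses.
    for defs, uses in reversed(stmts[checkpoint_index:]):
        live = (live - set(defs)) | set(uses)
    return live


# Hypothetical program, with a checkpoint placed just before statement 3:
#   0: n = 100
#   1: tmp = n * 2
#   2: a = tmp + 1
#   3: b = a + n
#   4: tmp = b * b
#   5: print(tmp)
program = [
    ({"n"}, set()),
    ({"tmp"}, {"n"}),
    ({"a"}, {"tmp"}),
    ({"b"}, {"a", "n"}),
    ({"tmp"}, {"b"}),
    (set(), {"tmp"}),
]

# Variables that must be written to the checkpoint file.
to_save = live_at(program, 3)
```

In this sketch `tmp` holds a value at the checkpoint but is overwritten before it is read again, so it can be left out of the saved state; only `a` and `n` need to be written.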