A checkpointing strategy for scalable recovery on distributed parallel systems
SC '97 Proceedings of the 1997 ACM/IEEE conference on Supercomputing
Providing Fault-Tolerance in Unreliable Grid Systems Through Adaptive Checkpointing and Replication
ICCS '07 Proceedings of the 7th international conference on Computational Science, Part I: ICCS 2007
Optimization of checkpointing-related I/O for high-performance parallel and distributed computing
The Journal of Supercomputing
Numerical computation algorithms for sequential checkpoint placement
Performance Evaluation
Job failures in high performance computing systems: A large-scale empirical study
Computers & Mathematics with Applications
Comparing checkpoint and rollback recovery schemes in a cluster system
ICA3PP'12 Proceedings of the 12th international conference on Algorithms and Architectures for Parallel Processing - Volume Part I
A policy-based approach for strong mobility of composed Web services
Service Oriented Computing and Applications