With the increased use of networks of NT workstations for long-running engineering applications, process checkpointing and process migration can avoid wasted computer cycles and improve system utilization. The problem we solve is how to capture and reconstruct process state transparently and efficiently without affecting the correctness of the application. A checkpoint facility enables the intermediate state of a process to be saved to a file. Users can later resume execution of the process from the checkpoint file. This prevents the loss of data generated by long-running processes due to program or system failures, and it also facilitates debugging when a bug appears only after the program has executed for a long time. This paper describes the implementation of a checkpoint library that permits users to save the intermediate state of long-running multi-threaded programs on a Windows NT system and to resume execution from the checkpointed state at a later time. Our Windows implementation is the first such implementation that we are aware of for this operating system. Our implementation is portable, maintains good performance, and is transparent. The checkpoint facility is currently used in several major internal projects at Intel.
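The save-to-file and resume-from-file cycle described above can be illustrated with a minimal sketch. This is not the paper's library, which captures process state transparently at the system level; it is an application-level analogue in Python, and the names `save_checkpoint`, `load_checkpoint`, and `run_sum`, as well as the `stop_after` parameter used to simulate a failure, are hypothetical.

```python
import os
import pickle
import tempfile


def save_checkpoint(path, state):
    """Write state to a temporary file, then atomically replace the checkpoint."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp = tempfile.mkstemp(dir=directory)
    with os.fdopen(fd, "wb") as f:
        pickle.dump(state, f)
        f.flush()
        os.fsync(f.fileno())
    # os.replace swaps the file in one step, so a crash never leaves a
    # half-written checkpoint under the final name.
    os.replace(tmp, path)


def load_checkpoint(path):
    """Return the saved state, or None if no checkpoint file exists."""
    if not os.path.exists(path):
        return None
    with open(path, "rb") as f:
        return pickle.load(f)


def run_sum(path, n, every=100, stop_after=None):
    """Sum 0..n-1, saving a checkpoint every `every` iterations.

    `stop_after` (hypothetical) aborts after that many iterations to
    simulate a program or system failure.
    """
    state = load_checkpoint(path) or {"i": 0, "total": 0}
    steps = 0
    while state["i"] < n:
        if stop_after is not None and steps >= stop_after:
            return None  # simulated failure: work since the last checkpoint is lost
        state["total"] += state["i"]
        state["i"] += 1
        steps += 1
        if state["i"] % every == 0:
            save_checkpoint(path, state)
    return state["total"]
```

In this sketch, a run interrupted partway through loses only the iterations performed since the last checkpoint; calling `run_sum` again with the same path continues from the saved state rather than from zero. The application must cooperate explicitly here, which is exactly the burden a transparent checkpoint facility is meant to remove.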