This paper describes issues in the design and implementation of checkpointing and recovery modules for the Kerrighed DSM cluster system. Our design targets a DSM that supports the sequential consistency model. The mechanisms are general enough to be used in a number of different checkpointing and recovery protocols. They are designed to support common performance optimizations suggested in the literature while remaining lightweight during fault-free execution. We also present preliminary performance results of the current implementation.