This paper presents a solution to the problem of transparently recovering asynchronous distributed computations on clusters of workstations when a node fails. A system with fault-tolerant features can survive the fault and continue its computation, although performance degradation is unavoidable when no hardware redundancy is available. For a long-running application, restarting from a checkpoint rather than restarting the whole computation is a major advantage. The paper describes the fault-tolerance feature of the DDG environment, which is oriented to cluster systems without spare hardware.
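The checkpoint-restart idea mentioned above can be illustrated with a minimal, single-process sketch. It is not the DDG mechanism, which the abstract does not detail. The names `save_checkpoint`, `load_checkpoint`, `run`, and the `fail_at` fault-injection parameter are hypothetical and exist only for this illustration.

```python
import os
import pickle
import tempfile


def save_checkpoint(path, state):
    """Write state atomically: dump to a temp file, then rename over the target."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    with os.fdopen(fd, "wb") as f:
        pickle.dump(state, f)
        f.flush()
        os.fsync(f.fileno())
    # os.replace swaps the file in one step, so a crash never leaves a half-written checkpoint.
    os.replace(tmp_path, path)


def load_checkpoint(path):
    """Return the saved state, or None if no checkpoint exists yet."""
    if not os.path.exists(path):
        return None
    with open(path, "rb") as f:
        return pickle.load(f)


def run(path, total_steps, checkpoint_every, fail_at=None):
    """Sum the integers 0..total_steps-1, checkpointing periodically.

    fail_at is a hypothetical hook that simulates a node fault at that step.
    """
    # Resume from the last checkpoint if one exists, otherwise start fresh.
    state = load_checkpoint(path) or {"step": 0, "acc": 0}
    while state["step"] < total_steps:
        if fail_at is not None and state["step"] == fail_at:
            raise RuntimeError("simulated node fault")
        state["acc"] += state["step"]
        state["step"] += 1
        if state["step"] % checkpoint_every == 0:
            save_checkpoint(path, state)
    return state["acc"]
```

If a run is interrupted, calling `run` again with the same path repeats only the steps performed after the last saved checkpoint, not the whole computation. A distributed environment would additionally need to coordinate checkpoints across nodes and reassign work from the failed node, which this sketch does not attempt.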