ADAPTATION - Algorithms to ADAPTive FAulT MonItOriNg and Their Implementation on CORBA
DOA '01 Proceedings of the Third International Symposium on Distributed Objects and Applications
Many kinds of applications built on the CORBA framework need fault tolerance in asynchronous distributed systems or network environments, and detecting faults quickly is important. Various fault monitoring and detection algorithms employ a timeout-based mechanism. However, they are occasionally inaccurate in unstable or overloaded systems. The goal of the proposed algorithm is to improve the accuracy of fault monitoring. It does so by promptly adjusting the timeout interval based on the accumulated past elapsed-time values. Additionally, we use asynchronous invocation to call the 'is_alive()' method of the monitorable object with a sequence number. Experiments on the CORBA-compliant Orbix ORB confirm the effectiveness of the proposed scheme compared to the existing one.
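
The idea of adapting the timeout from past elapsed times and tagging each asynchronous 'is_alive()' call with a sequence number can be illustrated with a minimal Python sketch. The abstract does not give the adjustment formula, so the rule used here (a safety factor times the mean of a sliding window of recent round-trip times) is an assumption, as are the class name `AdaptiveTimeoutMonitor` and its parameters `initial_timeout`, `window`, and `safety_factor`. No CORBA calls are made; times are passed in explicitly so that the logic is easy to follow.

```python
from collections import deque


class AdaptiveTimeoutMonitor:
    """Hypothetical fault monitor; not the paper's actual algorithm."""

    def __init__(self, initial_timeout=1.0, window=10, safety_factor=2.0):
        self.timeout = initial_timeout        # current timeout interval (seconds)
        self.safety_factor = safety_factor    # assumed margin over the mean elapsed time
        self.samples = deque(maxlen=window)   # recent elapsed times, oldest dropped first
        self.next_seq = 0                     # sequence number for the next probe
        self.pending = {}                     # sequence number -> send time

    def send_probe(self, now):
        """Record an asynchronous is_alive() probe and return its sequence number."""
        seq = self.next_seq
        self.next_seq += 1
        self.pending[seq] = now
        return seq

    def on_reply(self, seq, now):
        """Match a reply to its probe and adapt the timeout; ignore unknown replies."""
        sent = self.pending.pop(seq, None)
        if sent is None:
            return False  # stale or duplicate reply, no matching probe
        self.samples.append(now - sent)
        mean_elapsed = sum(self.samples) / len(self.samples)
        self.timeout = self.safety_factor * mean_elapsed
        return True

    def suspected(self, now):
        """Report a suspected fault if any probe has waited longer than the timeout."""
        return any(now - sent > self.timeout for sent in self.pending.values())
```

In this sketch the sequence number lets a late reply be paired with the probe that caused it, so a slow response updates the timeout rather than being mistaken for the answer to a newer probe. A larger `safety_factor` trades detection speed for fewer false suspicions under load.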