A Failure Detection System for Large Scale Distributed Systems
International Journal of Distributed Systems and Technologies
In this paper we present a solution for ensuring a high degree of availability and reliability in service-based large-scale distributed systems. The proposed architecture is based on a set of replicated services running in a fault-tolerant container and a proxy service that masks possible faults transparently to the client. Beyond masking faults, the solution optimizes access to the distributed services and their replicas through a load-balancing strategy while preserving a high degree of scalability. The advantages of the proposed architecture were evaluated using a pilot implementation. The results obtained show that the solution ensures a high degree of availability and reliability for a wide range of service-based distributed systems.
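
The proxy idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the load-balancing strategy or the proxy interface, so round-robin selection is assumed here, replicas are modeled as plain in-process callables, and every name (`FailoverProxy`, `AllReplicasFailed`, `make_replica`, `crashed_replica`) is hypothetical.

```python
from typing import Callable, List, Sequence

# A replica is modeled as a callable taking a request and returning a reply.
# This is an assumption for illustration; real replicas would be remote services.
Replica = Callable[[str], str]


class AllReplicasFailed(Exception):
    """Raised when every replica failed for a single request."""


class FailoverProxy:
    """Hypothetical proxy: round-robin load balancing plus failover."""

    def __init__(self, replicas: Sequence[Replica]) -> None:
        if not replicas:
            raise ValueError("at least one replica is required")
        self._replicas: List[Replica] = list(replicas)
        self._next = 0  # index of the replica that gets the next request first

    def call(self, request: str) -> str:
        n = len(self._replicas)
        start = self._next
        # Rotate the starting replica so successive requests are spread out.
        self._next = (self._next + 1) % n
        errors: List[Exception] = []
        for offset in range(n):
            replica = self._replicas[(start + offset) % n]
            try:
                return replica(request)
            except Exception as exc:
                # The fault is masked from the client: record it and try
                # the next replica in the ring.
                errors.append(exc)
        raise AllReplicasFailed(f"{n} replicas failed: {errors!r}")


def make_replica(name: str) -> Replica:
    """Build a healthy stand-in replica that tags replies with its name."""
    def handler(request: str) -> str:
        return f"{name}:{request}"
    return handler


def crashed_replica(request: str) -> str:
    """Stand-in for a replica that is currently unavailable."""
    raise ConnectionError("replica down")


if __name__ == "__main__":
    proxy = FailoverProxy([make_replica("a"), crashed_replica, make_replica("b")])
    for _ in range(4):
        print(proxy.call("ping"))
```

In this sketch the client only ever sees a reply or a single `AllReplicasFailed` error; individual replica faults stay hidden behind `call`. A production proxy would also need timeouts, health tracking so that known-dead replicas are skipped, and care with non-idempotent requests, none of which is modeled here.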