Abstract: Server replication is commonly used to improve the fault tolerance and response time of distributed services. An important problem when executing time-critical applications in a replicated environment is that of preventing timing failures by dynamically selecting the replicas that can satisfy a client's timing requirement, even when the quality of service is degraded due to replica failures and excess load on the server. In this paper, we describe the approach we have used to solve this problem in AQuA, a CORBA-based middleware that transparently replicates objects across a local area network. The approach we use estimates a replica's response time distribution based on performance measurements regularly broadcast by the replica. An online model uses these measurements to predict the probability with which a replica can prevent a timing failure for a client. A selection algorithm then uses this prediction to choose a subset of replicas that can together meet the client's timing constraints with at least the probability requested by the client. We conclude with experimental results based on our implementation.
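The selection step described in the abstract can be illustrated with a minimal sketch. It is not AQuA's actual model or algorithm, which the abstract does not specify. It assumes that each replica's response-time distribution is approximated by the empirical fraction of recent measurements that met the deadline, that replicas respond independently, and that replicas are added greedily in order of their individual on-time probability. The names `on_time_probability`, `select_replicas`, and `history` are hypothetical.

```python
from typing import Dict, List, Sequence


def on_time_probability(samples: Sequence[float], deadline: float) -> float:
    """Empirical estimate of P(response time <= deadline) from recent samples."""
    if not samples:
        return 0.0  # no measurements yet: assume the replica cannot be relied on
    return sum(1 for s in samples if s <= deadline) / len(samples)


def select_replicas(
    history: Dict[str, Sequence[float]],
    deadline: float,
    required: float,
) -> List[str]:
    """Greedily pick replicas until P(at least one replies in time) >= required.

    Assumption (not from the paper): replicas respond independently, so the
    probability that every chosen replica misses the deadline is the product
    of the individual miss probabilities.
    """
    probs = {name: on_time_probability(s, deadline) for name, s in history.items()}
    # Most promising replicas first; ties are broken by name for determinism.
    ranked = sorted(probs, key=lambda name: (-probs[name], name))

    chosen: List[str] = []
    miss_all = 1.0  # probability that every chosen replica is late
    for name in ranked:
        if 1.0 - miss_all >= required:
            break
        chosen.append(name)
        miss_all *= 1.0 - probs[name]
    # If the target is unreachable, every replica is returned as a best effort.
    return chosen


if __name__ == "__main__":
    # Hypothetical response times in milliseconds, as if broadcast by each replica.
    history = {
        "r1": [80, 90, 120, 150],
        "r2": [60, 70, 200, 210],
        "r3": [300, 310, 320, 330],
    }
    print(select_replicas(history, deadline=100, required=0.7))
```

With a 100 ms deadline, `r1` and `r2` each met the deadline in half of their samples, so under the independence assumption the pair together succeeds with probability 1 - 0.5 × 0.5 = 0.75, which satisfies a requested probability of 0.7. A single replica would not.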