Supporting uninterrupted services for distributed soft real-time applications is hard in resource-constrained and dynamic environments, where processor or process failures and system workload changes are common. Fault-tolerant middleware for these applications must achieve high service availability and satisfactory response times for client applications. Although passive replication is a promising fault tolerance strategy for resource-constrained systems, conventional client failover approaches are non-adaptive and load-agnostic, which can cause system overloads and significantly increase response times after failure recovery.

This paper presents four contributions to the study of passive replication for distributed soft real-time applications. First, it describes how our Fault-tolerant Load-aware and Adaptive middlewaRe (FLARe) dynamically adjusts failover targets at runtime in response to system load fluctuations and resource availability. Second, it describes how FLARe's overload management strategy proactively enforces desired CPU utilization bounds by redirecting clients from overloaded processors. Third, it presents the design and implementation of FLARe's lightweight middleware architecture that manages failures and overloads transparently to clients. Finally, it presents experimental results on a distributed Linux testbed that demonstrate how FLARe adaptively maintains soft real-time performance for clients operating in the presence of failures and overloads with negligible runtime overhead.
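To make the idea of load-aware failover concrete, the following is a minimal sketch, not FLARe's actual algorithm or API. It assumes each backup replica lives on a host with a known CPU utilization, orders failover targets from least to most loaded, and flags hosts whose utilization exceeds a configured bound. The names `Replica`, `rank_failover_targets`, `overloaded_hosts`, and the sample hosts and load values are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Replica:
    """Hypothetical record for a backup replica and the host it runs on."""
    name: str
    host: str


def rank_failover_targets(replicas, cpu_utilization):
    """Order backups so the replica on the least-loaded host comes first.

    Ties are broken by replica name to keep the ordering deterministic.
    This is an illustrative policy, not the one used by FLARe.
    """
    return sorted(replicas, key=lambda r: (cpu_utilization[r.host], r.name))


def overloaded_hosts(cpu_utilization, bound):
    """Return hosts whose CPU utilization exceeds the desired bound."""
    return sorted(h for h, u in cpu_utilization.items() if u > bound)


# Assumed sample data: three backups on three hosts.
backups = [Replica("b1", "hostA"), Replica("b2", "hostB"), Replica("b3", "hostC")]
load = {"hostA": 0.85, "hostB": 0.30, "hostC": 0.55}

ranked = rank_failover_targets(backups, load)
hot = overloaded_hosts(load, bound=0.70)

# Re-ranking after a load change models adjusting failover targets at runtime.
load_later = {"hostA": 0.20, "hostB": 0.90, "hostC": 0.55}
ranked_later = rank_failover_targets(backups, load_later)
```

In this sketch the ranking is recomputed whenever the load snapshot changes, which mirrors the abstract's point that failover targets should follow load fluctuations rather than being fixed at deployment time. How FLARe measures load and redirects clients is not specified here and is left out of the sketch.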