Adaptive Failover for Real-Time Middleware with Passive Replication

Authors:
Jaiganesh Balasubramanian;Sumant Tambe;Chenyang Lu;Aniruddha Gokhale;Christopher Gill;Douglas C. Schmidt
Venue:
RTAS '09 Proceedings of the 2009 15th IEEE Symposium on Real-Time and Embedded Technology and Applications
Year:
2009

Citing 0
Cited 8

A fault-tolerant architecture for transportation information services of e-government
AST/UCMA/ISA/ACN'10 Proceedings of the 2010 international conference on Advances in computer science and information technology

Supporting component-based failover units in middleware for distributed real-time and embedded systems
Journal of Systems Architecture: the EUROMICRO Journal

Rectifying orphan components using group-failover in distributed real-time and embedded systems
Proceedings of the 14th international ACM Sigsoft symposium on Component based software engineering

Towards reliable intelligent transportation systems for e-government
EGOVIS'11 Proceedings of the Second international conference on Electronic government and the information systems perspective

Autonomic resources management of CORBA based systems for transportation with an agent
ICCSA'10 Proceedings of the 2010 international conference on Computational Science and Its Applications - Volume Part IV

Resource management and fault tolerance principles for supporting distributed real-time and embedded systems in the cloud
Proceedings of the 9th Middleware Doctoral Symposium of the 13th ACM/IFIP/USENIX International Middleware Conference

Service overlays
Benchmarking Peer-to-Peer Systems

Realizing a fault-tolerant embedded controller on distributed real-time systems
ACM SIGBED Review - Special Issue on the 5th Workshop on Adaptive and Reconfigurable Embedded Systems

Abstract

Supporting uninterrupted services for distributed soft real-time applications is hard in resource-constrained and dynamic environments, where processor or process failures and system workload changes are common. Fault-tolerant middleware for these applications must achieve high service availability and satisfactory response times for client applications. Although passive replication is a promising fault tolerance strategy for resource-constrained systems, conventional client failover approaches are non-adaptive and load-agnostic, which can cause system overloads and significantly increase response times after failure recovery.

This paper presents four contributions to the study of passive replication for distributed soft real-time applications. First, it describes how our Fault-tolerant Load-aware and Adaptive middlewaRe (FLARe) dynamically adjusts failover targets at runtime in response to system load fluctuations and resource availability. Second, it describes how FLARe's overload management strategy proactively enforces desired CPU utilization bounds by redirecting clients from overloaded processors. Third, it presents the design and implementation of FLARe's lightweight middleware architecture that manages failures and overloads transparently to clients. Finally, it presents experimental results on a distributed Linux testbed that demonstrate how FLARe adaptively maintains soft real-time performance for clients operating in the presence of failures and overloads with negligible runtime overhead.
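
The abstract does not give FLARe's actual algorithms, so the following is only a minimal sketch of the general idea of load-aware failover target selection and a CPU utilization bound check. The names Replica, rank_failover_targets, and overloaded_hosts, the host names, the utilization figures, and the 0.70 bound are all hypothetical and chosen for illustration; none of them come from the paper.

```python
from dataclasses import dataclass


@dataclass
class Replica:
    """Hypothetical record for one passive replica (not FLARe's data model)."""
    host: str
    cpu_utilization: float  # fraction of host CPU in use, 0.0 to 1.0
    alive: bool = True


def rank_failover_targets(replicas):
    """Return live replicas ordered from least to most loaded host.

    A load-agnostic scheme would use a fixed order; here the order is
    recomputed from current utilization, which is the assumed adaptation.
    """
    live = [r for r in replicas if r.alive]
    return sorted(live, key=lambda r: r.cpu_utilization)


def overloaded_hosts(replicas, utilization_bound):
    """Return the sorted names of live hosts whose utilization exceeds the bound."""
    return sorted(
        {r.host for r in replicas if r.alive and r.cpu_utilization > utilization_bound}
    )


# Illustrative data only.
replicas = [
    Replica("host-a", 0.85),
    Replica("host-b", 0.40),
    Replica("host-c", 0.65, alive=False),
    Replica("host-d", 0.55),
]

targets = rank_failover_targets(replicas)
print([r.host for r in targets])
print(overloaded_hosts(replicas, utilization_bound=0.70))
```

In this sketch a client would fail over to the first entry of the ranked list, and clients of any host reported by overloaded_hosts would be candidates for redirection. How FLARe measures load, orders targets, and redirects clients is described in the paper itself and may differ substantially.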