Adaptive Failover for Real-Time Middleware with Passive Replication

  • Authors:
  • Jaiganesh Balasubramanian;Sumant Tambe;Chenyang Lu;Aniruddha Gokhale;Christopher Gill;Douglas C. Schmidt

  • Affiliations:
  • -;-;-;-;-;-

  • Venue:
  • RTAS '09 Proceedings of the 2009 15th IEEE Symposium on Real-Time and Embedded Technology and Applications
  • Year:
  • 2009

Quantified Score

Hi-index 0.00

Visualization

Abstract

Supporting uninterrupted services for distributed soft real-time applications is hard in resource-constrained and dynamic environments, where processor or process failures and system workload changes are common. Fault-tolerant middleware for these applications must achieve high service availability and satisfactory response times for client applications. Although passive replication is a promising fault tolerance strategy for resource-constrained systems, conventional client failover approaches are non-adaptive and load-agnostic, which can cause system overloads and significantly increase response times after failure recovery.This paper presents four contributions to the study of passive replication for distributed soft real-time applications. First, it describes how our Fault-tolerant Load-aware and Adaptive middlewaRe (FLARe) dynamically adjusts failover targets at runtime in response to system load fluctuations and resource availability. Second, it describes how FLARe's overload management strategy proactively enforces desired CPU utilization bounds by redirecting clients from overloaded processors. Third, it presents the design and implementation of FLARe's lightweight middleware architecture that manages failures and overloads transparently to clients. Finally, it presents experimental results on a distributed Linux testbed that demonstrate how FLARe adaptively maintains soft real-time performance for clients operating in the presence of failures and overloads with negligible runtime overhead.