Rectifying orphan components using group-failover in distributed real-time and embedded systems

  • Authors:
  • Sumant Tambe;Aniruddha Gokhale

  • Affiliations:
  • Vanderbilt University, Nashville, TN, USA;Vanderbilt University, Nashville, TN, USA

  • Venue:
  • Proceedings of the 14th international ACM Sigsoft symposium on Component based software engineering
  • Year:
  • 2011

Abstract

Orphan requests are a significant problem for multi-tier distributed systems since they adversely impact system correctness by violating the exactly-once semantics of applications and may waste resources. Orphan requests stem from the failure(s) of non-deterministic components involved in nested invocations of replicated components. Resolving this problem in the context of resource-constrained, component-based, distributed real-time and embedded (DRE) systems that form end-to-end task chains is challenging because conventional transaction-based solutions cannot assure real-time properties of the DRE applications. To address these challenges, this paper presents a group-failover protocol that comprises three key capabilities: real-time failure detection and client failover, timely mitigation of orphan requests, and two novel application state consistency strategies to ensure the correctness of DRE systems by maintaining the exactly-once semantics even during failures. Our solution is implemented in the context of the CIAO real-time CORBA Component Model middleware. Empirical evaluations of the group-failover protocol in both fault-free and failure-recovery scenarios for DRE task chains of different sizes demonstrate low overhead and predictable performance.