Towards middleware for fault-tolerance in distributed real-time and embedded systems

  • Authors:
  • Jaiganesh Balasubramanian; Aniruddha Gokhale; Douglas C. Schmidt; Nanbor Wang

  • Affiliations:
  • Department of Electrical Engineering and Computer Science, Vanderbilt University, Nashville, TN; Department of Electrical Engineering and Computer Science, Vanderbilt University, Nashville, TN; Department of Electrical Engineering and Computer Science, Vanderbilt University, Nashville, TN; Tech-X Corporation, Boulder, CO

  • Venue:
  • DAIS'08 Proceedings of the 8th IFIP WG 6.1 international conference on Distributed applications and interoperable systems
  • Year:
  • 2008

Abstract

Distributed real-time and embedded (DRE) systems often require support for multiple simultaneous quality of service (QoS) properties, such as real-timeliness and fault tolerance, while operating within resource-constrained environments. These resource constraints motivate the need for a lightweight middleware infrastructure, while the need for simultaneous QoS properties requires the middleware to provide fault tolerance capabilities that respect the time-critical needs of DRE systems. Conventional middleware solutions, such as Fault-tolerant CORBA (FTCORBA) and Continuous Availability API for J2EE, have limited utility for DRE systems because they are heavyweight (e.g., the complexity of their feature-rich fault tolerance capabilities consumes excessive runtime resources), yet incomplete (e.g., they lack mechanisms that enable fault tolerance while maintaining real-time predictability). This paper provides three contributions to the development and standardization of lightweight real-time and fault-tolerant middleware for DRE systems. First, we discuss the challenges in realizing real-time fault-tolerant solutions for DRE systems using contemporary middleware. Second, we describe recent progress towards standardizing a CORBA lightweight fault-tolerance specification for DRE systems. Third, we present the architecture of FLARe, a prototype based on the OMG real-time fault-tolerant CORBA middleware standardization efforts that is lightweight (e.g., it leverages only those server- and client-side mechanisms required for real-time systems) and predictable (e.g., it provides fault-tolerant mechanisms that respect the time-critical performance needs of DRE systems).