Low-Cost Error Containment and Recovery for Onboard Guarded Software Upgrading and Beyond

  • Authors:
  • Ann T. Tai;Kam S. Tso;Leon Alkalai;Savio N. Chau;William H. Sanders

  • Affiliations:
  • IA Tech, Inc., Los Angeles, CA;IA Tech, Inc., Los Angeles, CA;Jet Propulsion Lab., Pasadena, CA;Jet Propulsion Lab., Pasadena, CA;Univ. of Illinois at Urbana-Champaign, Urbana

  • Venue:
  • IEEE Transactions on Computers - Special issue on fault-tolerant embedded systems
  • Year:
  • 2002

Abstract

Message-driven confidence-driven (MDCD) error containment and recovery, a low-cost approach to mitigating the effect of software design faults in distributed embedded systems, is developed for onboard guarded software upgrading for deep-space missions. In this paper, we first describe and verify the MDCD algorithms in which we introduce the notion of "confidence-driven" to complement the "communication-induced" approach employed by a number of existing checkpointing protocols to achieve error containment and recovery efficiency. We then conduct a model-based analysis to show that the algorithms ensure low performance overhead. Finally, we discuss the advantages of the MDCD approach and its potential utility as a general-purpose, low-cost software fault tolerance technique for distributed embedded computing.