Minimalist Recovery Techniques for Single Event Effects in Spaceborne Microcontrollers
DCCA '99 Proceedings of the conference on Dependable Computing for Critical Applications
Abstract: This paper describes the use of fault tolerance in a microcontroller node intended for a network of embedded processors. It is primarily motivated by long-life space applications where radiation-induced transient errors will be a frequent occurrence, and a few chip failures may be expected before a mission is completed. A testbed has been constructed, and a real-time executive has been developed and tested in it. Preliminary fault-insertion testing has begun. Due to interconnection constraints for latchup circumvention and other reasons, we have chosen a design that is not Byzantine resilient. Even though inconsistent signaling may occur occasionally, multiple recovery actions must converge on successful testing and restart of the system to regain correct functionality.