Toward predictable, efficient, system-level tolerance of transient faults

  • Authors:
  • Jiguo Song; Gabriel Parmer

  • Affiliations:
  • The George Washington University, Washington, DC; The George Washington University, Washington, DC

  • Venue:
  • ACM SIGBED Review - Special Issue on the 5th Workshop on Adaptive and Reconfigurable Embedded Systems
  • Year:
  • 2013

Abstract

As embedded and real-time systems increase in complexity, and as chip process technologies continually decrease feature size, transient faults increasingly threaten to cause system failure. This paper introduces C3, a system to tolerate system-level faults (e.g., in the scheduler). When considering predictable recovery of system-level components, we introduce recovery interference, a side-effect of system-level recovery that causes possibly unbounded priority inversion. We discuss an interface-driven recovery technique that is effective, efficient, and uses on-demand recovery to avoid recovery interference.
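
The on-demand idea in the abstract can be illustrated with a toy model. This is only a sketch under stated assumptions and is not C3's implementation: the names `Server`, `ClientStub`, `epoch`, and `recovery_work` are hypothetical, and the model assumes that an interface stub on the client side records enough information to recreate each descriptor it created. After the component is rebooted, nothing is rebuilt eagerly; each descriptor is rebuilt only when its owner next uses it, so a high-priority client does not pay for restoring a low-priority client's state.

```python
class Server:
    """Hypothetical system-level component holding per-client descriptors."""

    def __init__(self):
        self.epoch = 0      # incremented on every reboot
        self.state = {}     # (client name, descriptor) -> value

    def reboot(self):
        """Model a reboot after a detected transient fault: all state is lost."""
        self.epoch += 1
        self.state = {}

    def create(self, key, value):
        self.state[key] = value

    def read(self, key):
        return self.state[key]


class ClientStub:
    """Hypothetical interface stub that tracks how to recreate its descriptors."""

    def __init__(self, name, server):
        self.name = name
        self.server = server
        self.tracked = {}        # descriptor -> (value, epoch when last valid)
        self.recovery_work = 0   # descriptors this client had to rebuild

    def create(self, desc, value):
        self.server.create((self.name, desc), value)
        self.tracked[desc] = (value, self.server.epoch)

    def read(self, desc):
        value, epoch = self.tracked[desc]
        if epoch != self.server.epoch:
            # Stale descriptor: rebuild only this one, in the caller's context.
            self.server.create((self.name, desc), value)
            self.tracked[desc] = (value, self.server.epoch)
            self.recovery_work += 1
        return self.server.read((self.name, desc))


server = Server()
high = ClientStub("high", server)
low = ClientStub("low", server)

high.create("lock", 1)
for i in range(100):
    low.create(f"obj{i}", i)

server.reboot()              # the fault wipes all 101 descriptors
result = high.read("lock")   # high rebuilds only its own descriptor
```

In this model, an eager scheme would instead rebuild all 101 descriptors at reboot time, in the context of whichever thread triggered recovery. That cost, charged to a high-priority thread on behalf of lower-priority ones, is the kind of priority inversion the abstract calls recovery interference.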