Autonomous FPGA Fault Handling through Competitive Runtime Reconfiguration

  • Authors:
  • Kening Zhang

  • Affiliations:
  • University of Central Florida

  • Venue:
  • EH '05 Proceedings of the 2005 NASA/DoD Conference on Evolvable Hardware
  • Year:
  • 2005

Abstract

An autonomous self-repair approach for SRAM-based FPGAs is developed based on Competitive Runtime Reconfiguration (CRR). Under the CRR technique, an initial population of functionally identical (same input-output behavior), yet physically distinct (alternative design or place-and-route realization) FPGA configurations is produced at design time. At run-time, these individuals compete for selection based on a fitness function favoring fault-free behavior. Hence, any physical resource exhibiting an operationally-significant fault decreases the fitness of those configurations which use it. Through runtime competition, the presence of the fault becomes occluded from the visibility of subsequent FPGA operations. Meanwhile, the offspring formed through crossover and mutation of faulty and viable configurations are reintroduced into the population. This enables evolution of a customized fault-specific repair, realized directly as new configurations using the FPGA's normal throughput processing operations. Multiple phases of the fault handling process including Detection, Isolation, Diagnosis, and Recovery are integrated into a single cohesive approach. FPGA-based multipliers are examined as a case study demonstrating evolution of a complete repair for a 3-bit x 3-bit multiplier from several stuck-at faults within a few thousand iterations. Repairs are evolved in-situ, in real-time, without test vectors, while allowing the FPGA to remain partially online.
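
The competition step described in the abstract can be illustrated with a toy Python sketch. It is not the authors' implementation. It assumes a simplified model in which each configuration is a set of physical resource IDs, a fault forces the least significant product bit to 0, and fitness is updated by comparing the outputs of two randomly paired configurations on ordinary inputs (no test vectors). The names `multiply_3x3` and `compete`, the reward and penalty values, and the fault model are all hypothetical. Crossover and mutation are omitted.

```python
import random


def multiply_3x3(a, b, resources, faulty):
    """Toy 3-bit x 3-bit multiplier realized on a set of physical resources.

    Assumed fault model: if the configuration uses any faulty resource,
    the least significant output bit is stuck at 0.
    """
    product = a * b
    if resources & faulty:
        product &= ~1  # stuck-at-0 on output bit 0
    return product


def compete(population, faulty, iterations, rng, reward=1, penalty=3):
    """Update fitness by pairwise output comparison on normal inputs.

    A discrepancy penalizes both configurations of the pair, because the
    comparison alone cannot tell which one is faulty. Agreement rewards both.
    """
    fitness = {name: 0 for name in population}
    for _ in range(iterations):
        left, right = rng.sample(sorted(population), 2)
        a, b = rng.randrange(8), rng.randrange(8)  # 3-bit operands
        out_left = multiply_3x3(a, b, population[left], faulty)
        out_right = multiply_3x3(a, b, population[right], faulty)
        delta = reward if out_left == out_right else -penalty
        fitness[left] += delta
        fitness[right] += delta
    return fitness


# Hypothetical population: same function, physically distinct resource sets.
population = {
    "A": frozenset({0, 1, 2}),
    "B": frozenset({3, 4, 5}),
    "C": frozenset({6, 7, 8}),
    "D": frozenset({9, 10, 11}),
}
fitness = compete(population, faulty={2}, iterations=2000, rng=random.Random(0))
preferred = max(fitness, key=fitness.get)  # configuration favored for selection
```

In this single-fault scenario only configuration `A` uses the faulty resource, so it is involved in every discrepancy and accumulates the lowest fitness, while the fault-free configurations are preferred for selection. Two configurations sharing the same fault would agree with each other under this toy model, which is one reason a real scheme needs more than pairwise agreement alone.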