Online fault tolerance for FPGA logic blocks

Authors:
John M. Emmert;Charles E. Stroud;Miron Abramovici
Affiliations:
Department of Electrical Engineering, Wright State University, Dayton, OH;Department of Electrical and Computer Engineering, Auburn University, AL;Design Automation for Flexible Chip Architectures, Framingham, MA
Venue:
IEEE Transactions on Very Large Scale Integration (VLSI) Systems
Year:
2007

Citing 18
Cited 7

Tight bounds for minimax grid matching, with applications to the average case analysis of algorithms

STOC '86 Proceedings of the eighteenth annual ACM symposium on Theory of computing
REMOD: a new methodology for designing fault-tolerant arithmetic circuits

IEEE Transactions on Very Large Scale Integration (VLSI) Systems
Methodologies for Tolerating Cell and Interconnect Faults in FPGAs

IEEE Transactions on Computers
Low overhead fault-tolerant FPGA systems

IEEE Transactions on Very Large Scale Integration (VLSI) Systems
On-line fault detection for bus-based field programmable gate arrays

IEEE Transactions on Very Large Scale Integration (VLSI) Systems
Tolerating operational faults in cluster-based FPGAs

FPGA '00 Proceedings of the 2000 ACM/SIGDA eighth international symposium on Field programmable gate arrays
A Fault Tolerant Technique for FPGAs

Journal of Electronic Testing: Theory and Applications
Using embedded FPGAs for SoC yield improvement

Proceedings of the 39th annual Design Automation Conference
Defect Tolerant SRAM Based FPGAs

ICCS '94 Proceedings of the 1994 IEEE International Conference on Computer Design: VLSI in Computer & Processors
Algorithms for Efficient Runtime Fault Recovery on Diverse FPGA Architectures

DFT '99 Proceedings of the 14th International Symposium on Defect and Fault-Tolerance in VLSI Systems
Partial reconfiguration of FPGA mapped designs with applications to fault tolerance and yield enhancement

FPL '97 Proceedings of the 7th International Workshop on Field-Programmable Logic and Applications
Dynamic Fault Tolerance in FPGAs via Partial Reconfiguration

FCCM '00 Proceedings of the 2000 IEEE Symposium on Field-Programmable Custom Computing Machines
On the Necessity of On-line-BIST in Safety-Critical Applications - A Case-Study

FTCS '99 Proceedings of the Twenty-Ninth Annual International Symposium on Fault-Tolerant Computing
BIST-Based Diagnosis of FPGA Interconnect

ITC '02 Proceedings of the 2002 IEEE International Test Conference
Roving STARs: An Integrated Approach to On-Line Testing, Diagnosis, and Fault Tolerance for FPGAs in Adaptive Computing Systems

EH '01 Proceedings of the 3rd NASA/DoD Workshop on Evolvable Hardware
Using Roving STARs for On-Line Testing and Diagnosis of FPGAs in Fault-Tolerant Applications

ITC '99 Proceedings of the 1999 IEEE International Test Conference
A survey of fault tolerant methodologies for FPGAs

ACM Transactions on Design Automation of Electronic Systems (TODAES)
Online BIST and BIST-based diagnosis of FPGA logic blocks

IEEE Transactions on Very Large Scale Integration (VLSI) Systems

System-on-Chip Test Architectures: Nanometer Design for Testability

System-on-Chip Test Architectures: Nanometer Design for Testability
Fault tolerant techniques for reconfigurable platforms

Proceedings of the 1st Amrita ACM-W Celebration on Women in Computing in India
Progress in autonomous fault recovery of field programmable gate arrays

ACM Computing Surveys (CSUR)
Heuristic search for adaptive, defect-tolerant multiprocessor arrays

ACM Transactions on Embedded Computing Systems (TECS) - Special section on ESTIMedia'12, LCTES'11, rigorous embedded systems design, and multiprocessor system-on-chip for cyber-physical systems
The survivability of design-specific spare placement in FPGA architectures with high defect rates

ACM Transactions on Design Automation of Electronic Systems (TODAES)
A hierarchical self-repairing architecture for fast fault recovery of digital systems inspired from paralogous gene regulatory circuits

IEEE Transactions on Very Large Scale Integration (VLSI) Systems
FPGA fault tolerant arithmetic logic: a case study using parallel-prefix adders

VLSI Design

Abstract

Most adaptive computing systems use reconfigurable hardware in the form of field-programmable gate arrays (FPGAs). For these systems to be fielded in harsh environments where high reliability and availability are a must, the applications running on the FPGAs must tolerate hardware faults that may occur during the lifetime of the system. In this paper, we present new fault-tolerant techniques for FPGA logic blocks, developed as part of the roving self-test areas (STARs) approach to online testing, diagnosis, and reconfiguration [1]. Our techniques can handle large numbers of faults (we show tolerance of over 100 logic faults via actual implementation on an FPGA consisting of a 20 × 20 array of logic blocks). A key novel feature is the reuse of defective logic blocks to increase the number of effective spares and extend the mission life. To increase fault tolerance, we not only use nonfaulty parts of defective or partially faulty logic blocks, but we also use faulty parts of defective logic blocks in nonfaulty modes. By using and reusing faulty resources, our multilevel approach extends the number of tolerable faults beyond the number of currently available spare logic resources. Unlike many column-, row-, or tile-based methods, our multilevel approach can tolerate not only faults that are evenly distributed over the logic area, but also clusters of faults in the same local area. Furthermore, system operation is not interrupted for fault diagnosis or for computing fault-bypassing configurations. Our fault tolerance techniques have been implemented using ORCA 2C series FPGAs, which feature incremental dynamic runtime reconfiguration.
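
The idea of reusing a partially faulty logic block for a function that does not exercise its faulty part can be illustrated with a small sketch. This is not the authors' algorithm: the mode names, the `Block` class, and the `pick_spare` helper are all hypothetical, and the selection rule (prefer the most damaged block that still works for the function, so that fault-free spares are kept in reserve) is only one plausible reading of the abstract.

```python
from dataclasses import dataclass


@dataclass
class Block:
    """A spare logic block; mode names such as "lut" are illustrative only."""
    name: str
    faulty_modes: frozenset = frozenset()

    def usable_for(self, needed):
        # A block is usable if none of the needed modes is faulty.
        return not (needed & self.faulty_modes)


def pick_spare(spares, needed):
    """Return the usable spare with the most faulty modes, or None."""
    candidates = [b for b in spares if b.usable_for(needed)]
    if not candidates:
        return None
    # Assumed policy: consume damaged blocks first, keep clean spares.
    return max(candidates, key=lambda b: len(b.faulty_modes))


spares = [
    Block("s0"),                                # fault-free spare
    Block("s1", frozenset({"carry"})),          # carry logic is faulty
    Block("s2", frozenset({"lut"})),            # lookup table is faulty
]
chosen = pick_spare(spares, frozenset({"lut", "flip_flop"}))
print(chosen.name)
```

Under this assumed policy, a function that needs only the lookup table and flip-flop is placed on the block with the faulty carry logic, leaving the fault-free spare available for a later fault.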