Tight bounds for minimax grid matching, with applications to the average case analysis of algorithms
STOC '86 Proceedings of the eighteenth annual ACM symposium on Theory of computing
REMOD: a new methodology for designing fault-tolerant arithmetic circuits
IEEE Transactions on Very Large Scale Integration (VLSI) Systems
Methodologies for Tolerating Cell and Interconnect Faults in FPGAs
IEEE Transactions on Computers
Low overhead fault-tolerant FPGA systems
IEEE Transactions on Very Large Scale Integration (VLSI) Systems
On-line fault detection for bus-based field programmable gate arrays
IEEE Transactions on Very Large Scale Integration (VLSI) Systems
Tolerating operational faults in cluster-based FPGAs
FPGA '00 Proceedings of the 2000 ACM/SIGDA eighth international symposium on Field programmable gate arrays
A Fault Tolerant Technique for FPGAs
Journal of Electronic Testing: Theory and Applications
Using embedded FPGAs for SoC yield improvement
Proceedings of the 39th annual Design Automation Conference
Defect Tolerant SRAM Based FPGAs
ICCS '94 Proceedings of the1994 IEEE International Conference on Computer Design: VLSI in Computer & Processors
Algorithms for Efficient Runtime Fault Recovery on Diverse FPGA Architectures
DFT '99 Proceedings of the 14th International Symposium on Defect and Fault-Tolerance in VLSI Systems
FPL '97 Proceedings of the 7th International Workshop on Field-Programmable Logic and Applications
Dynamic Fault Tolerance in FPGAs via Partial Reconfiguration
FCCM '00 Proceedings of the 2000 IEEE Symposium on Field-Programmable Custom Computing Machines
On the Necessity of On-line-BIST in Safety-Critical Applications - A Case-Study
FTCS '99 Proceedings of the Twenty-Ninth Annual International Symposium on Fault-Tolerant Computing
BIST-Based Diagnosis of FPGA Interconnect
ITC '02 Proceedings of the 2002 IEEE International Test Conference
EH '01 Proceedings of the The 3rd NASA/DoD Workshop on Evolvable Hardware
Using Roving STARs for On-Line Testing and Diagnosis of FPGAs in Fault-Tolerant Applications
ITC '99 Proceedings of the 1999 IEEE International Test Conference
A survey of fault tolerant methodologies for FPGAs
ACM Transactions on Design Automation of Electronic Systems (TODAES)
Online BIST and BIST-based diagnosis of FPGA logic blocks
IEEE Transactions on Very Large Scale Integration (VLSI) Systems
System-on-Chip Test Architectures: Nanometer Design for Testability
System-on-Chip Test Architectures: Nanometer Design for Testability
Fault tolerant techniques for reconfigurable platforms
Proceedings of the 1st Amrita ACM-W Celebration on Women in Computing in India
Progress in autonomous fault recovery of field programmable gate arrays
ACM Computing Surveys (CSUR)
Heuristic search for adaptive, defect-tolerant multiprocessor arrays
ACM Transactions on Embedded Computing Systems (TECS) - Special section on ESTIMedia'12, LCTES'11, rigorous embedded systems design, and multiprocessor system-on-chip for cyber-physical systems
The survivability of design-specific spare placement in FPGA architectures with high defect rates
ACM Transactions on Design Automation of Electronic Systems (TODAES)
IEEE Transactions on Very Large Scale Integration (VLSI) Systems
Hi-index | 0.00 |
Most adaptive computing systems use reconfigurable hardware in the form of field programmable gate arrays (FPGAs). For these systems to be fielded in harsh environments where high reliability and availability are a must, the applications running on the FPGAs must tolerate hardware faults that may occur during the lifetime of the system. In this paper, we present new fault-tolerant techniques for FPGA logic blocks, developed as part of the roving self-test areas (STARs) approach to online testing, diagnosis, and reconfiguration [1]. Our techniques can handle large numbers of faults (we show tolerance of over 100 logic faults via actual implementation on an FPGA consisting of a 20 × 20 array of logic blocks). A key novel feature is the reuse of defective logic blocks to increase the number of effective spares and extend the mission life. To increase fault tolerance, we not only use nonfaulty parts of defective or partially faulty logic blocks, but we also use faulty parts of defective logic blocks in nonfaulty modes. By using and reusing faulty resources, our multilevel approach extends the number of tolerable faults beyond the number of currently available spare logic resources. Unlike many column, row, or tile-based methods, our multilevel approach can tolerate not only faults that are evenly distributed over the logic area, but also clusters of faults in the same local area. Furthermore, system operation is not interrupted for fault diagnosis or for computing fault-bypassing configurations. Our fault tolerance techniques have been implemented using ORCA 2C series FPGAs which feature incremental dynamic runtime reconfiguration.