Tolerating operational faults in cluster-based FPGAs

Authors:
Vijay Lakamraju;Russell Tessier
Affiliations:
Department of Electrical and Computer Engineering, University of Massachusetts, Amherst, MA;Department of Electrical and Computer Engineering, University of Massachusetts, Amherst, MA
Venue:
FPGA '00 Proceedings of the 2000 ACM/SIGDA eighth international symposium on Field programmable gate arrays
Year:
2000

Citing 13
Cited 19

A CAD system for the design of field programmable gate arrays

DAC '91 Proceedings of the 28th ACM/IEEE Design Automation Conference
Reliable computer systems (2nd ed.): design and evaluation

Reliable computer systems (2nd ed.): design and evaluation
Placement and routing tools for the Triptych FPGA

IEEE Transactions on Very Large Scale Integration (VLSI) Systems
Generation of synthetic sequential benchmark circuits

FPGA '97 Proceedings of the 1997 ACM fifth international symposium on Field-programmable gate arrays
A fast routability-driven router for FPGAs

FPGA '98 Proceedings of the 1998 ACM/SIGDA sixth international symposium on Field programmable gate arrays
Methodologies for Tolerating Cell and Interconnect Faults in FPGAs

IEEE Transactions on Computers
Data security for Web-based CAD

DAC '98 Proceedings of the 35th annual Design Automation Conference
On-line fault detection for bus-based field programmable gate arrays

IEEE Transactions on Very Large Scale Integration (VLSI) Systems
Partial reconfiguration of FPGA mapped designs with applications to fault tolerance and yield enhancement

FPL '97 Proceedings of the 7th International Workshop on Field-Programmable Logic and Applications
VPR: A new packing, placement and routing tool for FPGA research

FPL '97 Proceedings of the 7th International Workshop on Field-Programmable Logic and Applications
Timing Driven Placement Reconfiguration for Fault Tolerance and Yield Enhancement in FPGAs

EDTC '96 Proceedings of the 1996 European conference on Design and Test
The RAW benchmark suite: computation structures for general purpose computing

FCCM '97 Proceedings of the 5th IEEE Symposium on FPGA-Based Custom Computing Machines
Performance of interconnection rip-up and reroute strategies

DAC '81 Proceedings of the 18th Design Automation Conference

Interconnect testing in cluster-based FPGA architectures

Proceedings of the 37th Annual Design Automation Conference
A memory coherence technique for online transient error recovery of FPGA configurations

FPGA '01 Proceedings of the 2001 ACM/SIGDA ninth international symposium on Field programmable gate arrays
Using embedded FPGAs for SoC yield improvement

Proceedings of the 39th annual Design Automation Conference
Reconfigurable Computing for Digital Signal Processing: A Survey

Journal of VLSI Signal Processing Systems
Diagnosis of interconnect faults in cluster-based FPGA architectures

Proceedings of the 2000 IEEE/ACM international conference on Computer-aided design
Performance Penalty for Fault Tolerance in Roving STARs

FPL '00 Proceedings of The Roadmap to Reconfigurable Computing, 10th International Workshop on Field-Programmable Logic and Applications
Cluster-based detection of SEU-caused errors in LUTs of SRAM-based FPGAs

Proceedings of the 2005 Asia and South Pacific Design Automation Conference
Hybrid CMOS/nanoelectronic digital circuits: devices, architectures, and design automation

ICCAD '05 Proceedings of the 2005 IEEE/ACM International conference on Computer-aided design
A survey of fault tolerant methodologies for FPGAs

ACM Transactions on Design Automation of Electronic Systems (TODAES)
Design of the EPLD-based reconfigurable fault-tolerant systems with cell-level redundancy

Automation and Remote Control
Online fault tolerance for FPGA logic blocks

IEEE Transactions on Very Large Scale Integration (VLSI) Systems
Choose-your-own-adventure routing: lightweight load-time defect avoidance

Proceedings of the ACM/SIGDA international symposium on Field programmable gate arrays
Fault tolerant placement and defect reconfiguration for nano-FPGAs

Proceedings of the 2008 IEEE/ACM International Conference on Computer-Aided Design
Reconfigurable Computing: The Theory and Practice of FPGA-Based Computation

Reconfigurable Computing: The Theory and Practice of FPGA-Based Computation
Progress in autonomous fault recovery of field programmable gate arrays

ACM Computing Surveys (CSUR)
Choose-your-own-adventure routing: Lightweight load-time defect avoidance

ACM Transactions on Reconfigurable Technology and Systems (TRETS)
Placement of repair circuits for in-field FPGA repair

Proceedings of the ACM/SIGDA international symposium on Field programmable gate arrays
The survivability of design-specific spare placement in FPGA architectures with high defect rates

ACM Transactions on Design Automation of Electronic Systems (TODAES)
FPGA fault tolerant arithmetic logic: a case study using parallel-prefix adders

VLSI Design

Abstract

In recent years, the application space of reconfigurable devices has grown to include many platforms with a strong need for fault tolerance. While these systems frequently contain hardware redundancy to allow continued operation in the presence of operational faults, there is a strong need to recover faulty hardware and return it to full functionality quickly and efficiently. In addition to providing functional density, FPGAs provide a level of fault tolerance generally not found in mask-programmable devices: they can be reconfigured around operational faults in the field. This paper describes incremental CAD techniques that allow functional recovery of FPGA design configurations in the presence of single or multiple operational faults. Our preferred approach to fault recovery takes advantage of the device routing hierarchy in architectural families such as Xilinx Virtex [2] and Altera Apex [3] to quickly swap unused logic and routing resources in place of faulty ones within logic clusters. These algorithms allow for straightforward implementation within a local fault-tolerant system, without the need to access a remote processing location. If initial recovery attempts through localized swapping fail, an incremental router based on the widely used PathFinder maze routing algorithm [10] can be applied remotely in an attempt to form connections between newly allocated logic and interconnect, based on the history of the initial design route.
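
The two-stage recovery flow described in the abstract (swap locally within a cluster first, fall back to incremental rerouting only if that fails) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the `Cluster` class, the `recover` function, and the status strings are hypothetical names introduced here, and the sketch assumes that the cluster's local interconnect can reach every slot, so that moving a function to a spare slot leaves routing outside the cluster unchanged.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set


@dataclass
class Cluster:
    """Hypothetical model of a logic cluster with a fixed number of slots."""
    num_slots: int
    # Maps slot index -> name of the logic function placed in that slot.
    used: Dict[int, str] = field(default_factory=dict)
    # Slots that have been marked faulty so far.
    faulty: Set[int] = field(default_factory=set)

    def spare_slots(self) -> List[int]:
        """Return slots that are neither occupied nor faulty."""
        return [s for s in range(self.num_slots)
                if s not in self.used and s not in self.faulty]


def recover(cluster: Cluster, bad_slot: int) -> str:
    """Mark bad_slot faulty and try to repair the cluster locally.

    Returns one of three hypothetical status strings:
      "unaffected"    - the faulty slot held no logic, nothing to move
      "swapped"       - the function was moved to a spare slot in the cluster
      "needs_reroute" - no spare is left; a fallback such as an incremental
                        router would have to take over (not modeled here)
    """
    cluster.faulty.add(bad_slot)
    if bad_slot not in cluster.used:
        return "unaffected"
    spares = cluster.spare_slots()
    if not spares:
        return "needs_reroute"
    # Assumption: any spare slot is equivalent, so take the lowest index.
    cluster.used[spares[0]] = cluster.used.pop(bad_slot)
    return "swapped"


if __name__ == "__main__":
    c = Cluster(num_slots=4, used={0: "f0", 1: "f1"})
    print(recover(c, 1), c.used)
    print(recover(c, 3), c.used)
    print(recover(c, 0), c.used)
```

In this toy model the local swap needs no information beyond the cluster itself, which mirrors the abstract's point that localized recovery can run without access to a remote processing location. The rerouting fallback is deliberately left as a status string rather than modeled.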