A framework for system reliability analysis considering both system error tolerance and component test quality

Authors:
Sung-Jui Pan; Kwang-Ting Cheng
Affiliations:
University of California, Santa Barbara, CA; University of California, Santa Barbara, CA
Venue:
Proceedings of the conference on Design, automation and test in Europe
Year:
2007

Abstract

The failure rate, the sources of failure, and the test costs of nanometer devices are all increasing. Creating a reliable system-on-a-chip device therefore requires designers to implement fault tolerance. However, while system-level fault tolerance can significantly relax the quality requirements on the system's building blocks, every fault-tolerant scheme works only under certain failure mechanisms and within a certain range of error probabilities. Moreover, designing a system around components with a high failure rate can be very expensive, because the design complexity and the system overhead needed for fault tolerance can grow significantly faster than the component failure rate. It is therefore desirable to understand the trade-offs between component test quality and system fault-tolerance capability when targeting a desired reliability under cost constraints. In this paper, we propose an analysis framework for system reliability that considers (a) the test quality achieved by manufacturing testing, on-line self-checking, and off-line built-in self-test; (b) the fault-tolerance and sparing schemes; and (c) the component defect and error probabilities. We demonstrate that, with a proper redundancy configuration and low-cost testing to ensure a certain degree of component test quality, a low-redundancy system might achieve reliability equal to or higher than that of a high-redundancy system.
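
The closing claim of the abstract can be illustrated with a toy calculation. The sketch below is not the framework proposed in the paper; it is a minimal stand-in built on assumptions that do not come from the abstract: the Williams-Brown defect-level model (a component that passes a test with fault coverage T is defect-free with probability Y^(1-T), where Y is the component yield), a perfect majority voter, independent components, and a defective component that always produces errors. The function names and the numbers (yield 0.8, coverage 0.9) are hypothetical and chosen only for illustration.

```python
from math import comb


def good_probability_after_test(component_yield, fault_coverage):
    """Probability that a component which passed test is defect-free.

    Assumes the Williams-Brown model: defect level = 1 - Y**(1 - T).
    With fault_coverage = 0 this reduces to the raw yield (no testing).
    """
    return component_yield ** (1.0 - fault_coverage)


def k_of_n_reliability(n, k, p_good):
    """Probability that at least k of n independent components are good."""
    return sum(
        comb(n, i) * p_good**i * (1.0 - p_good) ** (n - i)
        for i in range(k, n + 1)
    )


def nmr_reliability(n, p_good):
    """N-modular redundancy with a perfect majority voter (assumption)."""
    return k_of_n_reliability(n, n // 2 + 1, p_good)


# Hypothetical numbers, chosen only to illustrate the trade-off.
COMPONENT_YIELD = 0.8

# Low redundancy (3 modules) with components tested at 90% fault coverage.
tmr_tested = nmr_reliability(
    3, good_probability_after_test(COMPONENT_YIELD, 0.9)
)

# High redundancy (5 modules) with untested components.
fivemr_untested = nmr_reliability(
    5, good_probability_after_test(COMPONENT_YIELD, 0.0)
)

print(f"3-modular, tested components:   {tmr_tested:.4f}")
print(f"5-modular, untested components: {fivemr_untested:.4f}")
```

Under these assumptions the tested 3-module configuration comes out at roughly 0.9986 and the untested 5-module configuration at roughly 0.9421, which is the kind of outcome the abstract describes. The paper's own analysis also covers on-line self-checking, off-line built-in self-test, and spare schemes, none of which this sketch models.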