A Low Cost Scheme for Reducing Silent Data Corruption in Large Arithmetic Circuits

  • Authors:
  • Abhisek Pan;James W. Tschanz;Sandip Kundu

  • Venue:
  • DFT '08 Proceedings of the 2008 IEEE International Symposium on Defect and Fault Tolerance of VLSI Systems
  • Year:
  • 2008

Abstract

Aggressive scaling of CMOS transistors over the last four decades has resulted in circuits with progressively higher packing density, increased switching speed, and higher power density. However, future CMOS technology nodes are predicted to suffer from greater intermediate- to long-term reliability and circuit marginality problems. To address these problems, researchers have proposed using redundant circuits to detect and, in some cases, correct transient or permanent field failures. The proposed solutions target 100% of circuit errors but are expensive in terms of area and, more importantly, power overhead, sometimes exceeding 200%. In this paper, we investigate a flexible, lower-overhead error detection scheme that provides a trade-off between reliability and circuit overhead in terms of area, performance, and power. Simulation studies on a 32-bit multiplier circuit show that this scheme can provide greater than 90% fault coverage with 15 to 20% area overhead, including the overhead of the comparator circuitry. Additionally, the redundant portion of the circuit can be turned off when concurrent checking is not critical, resulting in power savings. Currently, mainstream desktop processors use low-cost error detection schemes in buses, I/Os, and sometimes embedded memories to reduce the likelihood of silent data corruption (SDC). The discussed approach provides a significant reduction in the likelihood of silent data corruption in generally unprotected arithmetic circuits at a small cost.
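
The abstract does not describe the checking mechanism itself, so the following is only an illustrative sketch of concurrent error detection on a 32-bit multiplier, not the scheme evaluated in the paper. It assumes a mod-3 residue check (a well-known low-cost technique, substituted here for illustration) and a single-bit-flip fault model on the 64-bit product. All function names and parameters are hypothetical.

```python
import random

def multiply(a, b):
    """Reference 32x32 -> 64-bit unsigned multiplier (the 'main' circuit)."""
    return (a * b) & ((1 << 64) - 1)

def residue_check(a, b, product, modulus=3):
    """Small redundant checker: compare residues instead of full products.

    Assumption: a mod-3 residue code stands in for the paper's (unspecified)
    redundant circuit and comparator.
    """
    expected = ((a % modulus) * (b % modulus)) % modulus
    return expected == product % modulus

def estimate_coverage(trials=10_000, modulus=3, seed=0):
    """Inject one random bit flip into each product and count detections."""
    rng = random.Random(seed)
    detected = 0
    for _ in range(trials):
        a = rng.getrandbits(32)
        b = rng.getrandbits(32)
        faulty = multiply(a, b) ^ (1 << rng.randrange(64))  # assumed fault model
        if not residue_check(a, b, faulty, modulus):
            detected += 1
    return detected / trials

if __name__ == "__main__":
    print(f"estimated coverage: {estimate_coverage():.3f}")
```

Under this particular fault model, a single flipped bit changes the product by a power of two, which is never a multiple of 3, so the check flags every injected fault. Multi-bit or internal-node faults would give lower coverage, which is the kind of reliability-versus-overhead trade-off the abstract refers to. Skipping the call to `residue_check` is the software analogue of powering down the redundant portion when concurrent checking is not critical.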