Algorithm Level Fault Tolerance: A Technique to Cope with Long Duration Transient Faults in Matrix Multiplication Algorithms

  • Authors:
  • Carlos Arthur Lang Lisboa;Costas Argyrides;Dhiraj Kumar Pradhan;Luigi Carro

  • Venue:
  • VTS '08 Proceedings of the 26th IEEE VLSI Test Symposium
  • Year:
  • 2008

Abstract

For technologies beyond the 45 nm node, radiation-induced transients will last longer than one clock cycle. In this scenario, temporal redundancy techniques will no longer be able to cope with radiation-induced soft errors, while spatial redundancy techniques still impose high power and area overheads. The solution to this impasse is the use of algorithm-level techniques able to detect and correct errors at low cost. In this paper, a new approach to this problem is proposed and applied to the matrix multiplication algorithm. The proposed technique is compared with previously published fault tolerance techniques, and the detection and recomputation costs of both approaches are discussed.
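
The abstract does not spell out the proposed technique, so the following is only an illustrative sketch of what algorithm-level detection and recomputation can look like for matrix multiplication. It uses a well-known Freivalds-style check (comparing A(Br) with Cr for random 0/1 vectors r), which is not claimed to be the authors' method. All function names (`matmul`, `product_is_consistent`, `checked_matmul`) and the retry policy are assumptions made for this example, and exact equality is used because the inputs are assumed to be integers; floating-point data would need a tolerance.

```python
import random


def matmul(a, b):
    """Plain product of two matrices given as lists of rows."""
    inner, cols = len(b), len(b[0])
    return [[sum(row[k] * b[k][j] for k in range(inner)) for j in range(cols)]
            for row in a]


def matvec(a, v):
    """Matrix-vector product."""
    return [sum(x * y for x, y in zip(row, v)) for row in a]


def product_is_consistent(a, b, c, trials=2, rng=random):
    """Freivalds-style check (assumed here, not taken from the paper).

    Each trial costs three matrix-vector products, O(n^2), instead of
    the O(n^3) needed to recompute the full product.
    """
    for _ in range(trials):
        r = [rng.randint(0, 1) for _ in range(len(b[0]))]
        if matvec(a, matvec(b, r)) != matvec(c, r):
            return False  # mismatch: c cannot be the product a * b
    return True  # no mismatch found; an error can still slip through


def checked_matmul(a, b, max_retries=3):
    """Hypothetical detect-and-recompute loop around matmul."""
    for _ in range(max_retries):
        c = matmul(a, b)
        if product_is_consistent(a, b, c):
            return c
    raise RuntimeError("product failed verification after retries")
```

A single wrong element in the result escapes one trial with probability at most 1/2, so the number of trials trades detection confidence against checking cost. Recomputing the whole product on a mismatch is the simplest possible recovery; schemes that locate the faulty element can recompute less.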