Hardware dependability in the presence of soft errors

  • Authors:
  • Ashish Darbari;Bashir M. Al Hashimi

  • Affiliations:
  • School of Electronics and Computer Science, University of Southampton, England;School of Electronics and Computer Science, University of Southampton, England

  • Venue:
  • VoCS'08 Proceedings of the 2008 international conference on Visions of Computer Science: BCS International Academic Conference
  • Year:
  • 2008

Abstract

Using formal verification to produce hardware designs free from logic bugs has been an active area of research for the last 15 years. The technology has matured, and we now have a choice of formal tools such as model checkers, equivalence checkers, and a range of theorem provers. Hardware reliability and fault tolerance have also been studied for a long time, and some good redundancy-based solutions are available for making hardware resilient against faults. However, the impact of a particular kind of fault, known as a single-event upset (SEU) or transient fault, is not well understood, especially in the context of low-power design, and achieving adequate tolerance against SEUs in low-power processors is therefore still very much an open problem. A significant bottleneck has been the traditional fault injection methodology, in which the impact of a fault is analysed whilst a processor is running a specific binary program image. The observed impact of the fault is thus limited to what that particular program exercises. Another key problem has been that the original design must be modified to incorporate fault injection hardware, so the design being checked for faults differs from the original design. In this paper we report on our experience of studying transient fault injection on a 32-bit multi-cycle RISC processor using the formal specification and verification framework of Symbolic Trajectory Evaluation (STE). Our approach allows fault injection to be studied without modifying the original design and in a program-independent way. The vulnerability of the processor is assessed in terms of its architectural features, which is made possible by symbolic model checking.
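
To make the idea of program-independent fault analysis concrete, the following is a toy sketch only. It is not STE and not the authors' framework: exhaustive enumeration over a 4-bit register stands in for the symbolic values a symbolic model checker would use, and the names `step`, `flip_bit`, and `observable_fraction` are hypothetical. It flips one register bit and asks, over every possible register and operand value rather than one program's values, how often the upset reaches the output.

```python
from itertools import product

WIDTH = 4  # toy register width (assumption); the paper's processor is 32-bit


def step(reg, operand, op):
    """One toy datapath step combining a register value with an operand."""
    mask = (1 << WIDTH) - 1
    if op == "add":
        return (reg + operand) & mask
    if op == "and":
        return reg & operand
    raise ValueError(f"unknown op: {op}")


def flip_bit(value, bit):
    """Model a single-event upset as a single bit flip."""
    return value ^ (1 << bit)


def observable_fraction(op, bit):
    """Fraction of all (register, operand) pairs for which the flip changes the output."""
    total = 0
    observed = 0
    for reg, operand in product(range(1 << WIDTH), repeat=2):
        golden = step(reg, operand, op)
        faulty = step(flip_bit(reg, bit), operand, op)
        total += 1
        observed += golden != faulty
    return observed / total


if __name__ == "__main__":
    for op in ("add", "and"):
        for bit in range(WIDTH):
            print(op, bit, observable_fraction(op, bit))
```

In this toy model a flip feeding an addition is always visible, whereas a flip feeding a bitwise AND is masked whenever the corresponding operand bit is 0. A single fixed program would sample only a few of these input pairs, which is the limitation of program-specific fault injection described above.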