Low Overhead Soft Error Mitigation Techniques for High-Performance and Aggressive Designs

  • Authors:
  • Naga Durga Prasad Avirneni; Arun Somani

  • Affiliations:
  • Iowa State University, Ames; Iowa State University, Ames

  • Venue:
  • IEEE Transactions on Computers
  • Year:
  • 2012

Quantified Score

Hi-index 14.98

Abstract

The threat of soft-error-induced system failure in computing systems has become more prominent as we adopt ultradeep submicron process technologies. In this paper, we propose two efficient soft error mitigation schemes, Soft Error Mitigation (SEM) and Soft and Timing Error Mitigation (STEM), which clock data multiple times to protect combinational logic blocks from soft errors. Our first technique, SEM, is based on distributed and temporal voting of three registers and removes the soft error detection overhead from the critical path of the system. SEM can also ignore false errors, and it recovers from soft errors through fast in-situ recovery, avoiding recomputation. Our second technique, STEM, tolerates soft errors and adds timing error detection to guarantee reliable execution in aggressively clocked designs, which enhance system performance by operating beyond the worst-case clock frequency. We also present a specialized low-overhead clock phase management scheme that supports both techniques. Timing-annotated gate-level simulations of a pipelined adder-multiplier and a DLX processor, using 45 nm libraries, show that both techniques achieve near 100 percent fault coverage. For the DLX processor, even under severe fault injection campaigns, SEM achieves an average performance improvement of 26.58 percent over a conventional triple modular redundancy voter-based soft error mitigation scheme, while STEM outperforms SEM by 27.42 percent.
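
The idea of temporal voting over three registers can be illustrated with a small software sketch. This is not the authors' circuit. It is a minimal behavioral model, assuming that the same combinational output is captured by three registers at three different clock phases and that a transient glitch corrupts at most one of those captures. The function names (`majority3`, `sample_three_phases`, `vote`) and the glitch model are hypothetical, introduced only for illustration.

```python
from typing import Optional, Tuple


def majority3(a: int, b: int, c: int) -> int:
    """Bitwise majority of three equal-width words."""
    return (a & b) | (a & c) | (b & c)


def sample_three_phases(
    value: int,
    glitch_phase: Optional[int] = None,
    glitch_mask: int = 0,
) -> Tuple[int, int, int]:
    """Model three registers capturing the same logic output at phases 0, 1, 2.

    Assumption: a transient flips the bits in `glitch_mask` only in the
    capture taken at `glitch_phase`; the other two captures are clean.
    """
    samples = []
    for phase in range(3):
        captured = value ^ glitch_mask if phase == glitch_phase else value
        samples.append(captured)
    return samples[0], samples[1], samples[2]


def vote(samples: Tuple[int, int, int]) -> Tuple[int, bool]:
    """Return the majority-voted word and whether the captures disagreed."""
    r1, r2, r3 = samples
    voted = majority3(r1, r2, r3)
    mismatch = not (r1 == r2 == r3)
    return voted, mismatch


correct = 0b10110010

# A glitch hits only the second capture: the vote still yields the correct word.
glitched = sample_three_phases(correct, glitch_phase=1, glitch_mask=0b00000100)
voted_value, error_seen = vote(glitched)

# No glitch: all three captures agree and no error is flagged.
clean_value, clean_error = vote(sample_three_phases(correct))
```

In this model the voted value equals the correct word as long as at most one capture is corrupted, which is the property a three-way majority vote provides. How the paper distributes the voting across the pipeline, handles false errors, and performs in-situ recovery is not modeled here.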