Dynamic transient fault detection and recovery for embedded processor datapaths

  • Authors:
  • Garo Bournoutian; Alex Orailoglu

  • Affiliations:
  • University of California, San Diego, La Jolla, CA, USA; University of California, San Diego, La Jolla, CA, USA

  • Venue:
  • Proceedings of the eighth IEEE/ACM/IFIP international conference on Hardware/software codesign and system synthesis
  • Year:
  • 2012

Abstract

As microprocessors continue to evolve and grow in functionality, scaling to smaller nanometer technology nodes, coupled with high clock frequencies and exponentially increasing transistor counts, dramatically increases their susceptibility to transient faults. However, correct and reliable operation of these processors is often mandatory, both for the consumer experience and for high-risk embedded domains such as medical and transportation systems. Economical fault detection and recovery therefore becomes essential to meet market requirements. This paper explores how superscalar, out-of-order architectures can be leveraged efficiently, and in a novel manner, to provide multi-cycle transient fault tolerance throughout the datapath. Using dynamic instruction execution redundancy, soft errors within the datapath are both detected and recovered from. The proposed microarchitecture selectively reevaluates only the corrupted instructions, reducing the cost of recovery by preserving completed instructions that were unaffected by the fault. The additional computational workload is dynamically staggered to exploit the out-of-order nature of the architecture and to minimize resource conflicts and delays.
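
The detect-and-selectively-recover idea described above can be illustrated with a small software model. This is only a minimal sketch under stated assumptions, not the paper's microarchitecture: it assumes detection works by executing each instruction twice and comparing the results, and it runs instructions in order, so the dynamic staggering and out-of-order scheduling are not modeled. The names `Instr`, `execute`, `run_with_redundancy`, and the `faults` map are hypothetical and exist purely for illustration.

```python
import operator
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional, Tuple


@dataclass(frozen=True)
class Instr:
    """A hypothetical three-operand instruction: dest = op(src1, src2)."""
    dest: str
    op: Callable[[int, int], int]
    src1: str
    src2: str


def execute(instr: Instr, regs: Dict[str, int], flip_bit: Optional[int] = None) -> int:
    """Compute one result; optionally flip one bit to model a soft error."""
    result = instr.op(regs[instr.src1], regs[instr.src2])
    if flip_bit is not None:
        result ^= 1 << flip_bit
    return result


def run_with_redundancy(
    program: List[Instr],
    regs: Dict[str, int],
    faults: Dict[Tuple[int, int], int],
    max_retries: int = 3,
) -> Tuple[Dict[str, int], List[int]]:
    """Run each instruction twice and commit only when both copies agree.

    `faults` maps (instruction index, attempt number) to the bit flipped in
    the primary copy. On a mismatch, only that instruction is reevaluated;
    results committed by earlier instructions are left untouched.
    """
    regs = dict(regs)  # work on a copy so the caller's state is preserved
    reexecuted: List[int] = []
    for index, instr in enumerate(program):
        for attempt in range(max_retries + 1):
            primary = execute(instr, regs, faults.get((index, attempt)))
            shadow = execute(instr, regs)  # redundant copy, assumed fault-free here
            if primary == shadow:
                regs[instr.dest] = primary  # copies agree: commit the result
                break
            reexecuted.append(index)  # mismatch detected: retry this instruction only
        else:
            raise RuntimeError(f"instruction {index} did not converge")
    return regs, reexecuted


program = [
    Instr("r3", operator.add, "r1", "r2"),
    Instr("r4", operator.mul, "r3", "r2"),
    Instr("r5", operator.sub, "r4", "r1"),
]
# Inject a single bit flip into the first attempt of instruction 1.
final_regs, redone = run_with_redundancy(program, {"r1": 3, "r2": 4}, {(1, 0): 2})
```

In this model only instruction 1 is reevaluated after the injected fault; instruction 0 has already committed and is not repeated, which mirrors the abstract's point about preserving completed instructions unaffected by the fault.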