Adaptive online testing for efficient hard fault detection

  • Authors:
  • Shantanu Gupta; Amin Ansari; Shuguang Feng; Scott Mahlke

  • Affiliations:
  • Advanced Computer Architecture Laboratory, University of Michigan, Ann Arbor;Advanced Computer Architecture Laboratory, University of Michigan, Ann Arbor;Advanced Computer Architecture Laboratory, University of Michigan, Ann Arbor;Advanced Computer Architecture Laboratory, University of Michigan, Ann Arbor

  • Venue:
  • ICCD'09: Proceedings of the 2009 IEEE International Conference on Computer Design
  • Year:
  • 2009

Abstract

With growing semiconductor integration, the reliability of individual transistors is expected to rapidly decline in future technology generations. In such a scenario, processors would need to be equipped with fault tolerance mechanisms to tolerate in-field silicon defects. Periodic online testing is a popular technique to detect such failures; however, it tends to impose a heavy testing penalty. In this paper, we propose an adaptive online testing framework to significantly reduce the testing overhead. The proposed approach is unique in its ability to assess the hardware health and apply suitably detailed tests. Thus, a significant chunk of the testing time can be saved for the healthy components. We further extend the framework to work with the StageNet CMP fabric, which provides the flexibility to group together pipeline stages with similar health conditions, thereby reducing the overall testing burden. For a modest 2.6% sensor area overhead, the proposed scheme was able to achieve an 80% reduction in software test instructions over the lifetime of a 16-core CMP.
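
The adaptive idea in the abstract (estimate each component's health, then apply a test of matching detail) can be illustrated with a minimal sketch. This is not the authors' implementation: the `Component` type, the health scale, the thresholds, the suite names, and the instruction counts are all hypothetical values chosen only to show the shape of the policy. The `group_by_health` helper is likewise an assumed, simplified stand-in for grouping pipeline stages with similar health conditions.

```python
from dataclasses import dataclass

# Hypothetical test suites: name -> number of software test instructions.
# These counts are illustrative only, not figures from the paper.
TEST_SUITES = {"light": 1_000, "medium": 10_000, "full": 100_000}


@dataclass
class Component:
    name: str
    health: float  # assumed sensor-derived estimate: 1.0 = like new, 0.0 = worn out


def select_test_level(health: float) -> str:
    """Pick a test suite whose detail grows as the estimated health drops."""
    if health >= 0.8:  # assumed threshold for a healthy component
        return "light"
    if health >= 0.5:  # assumed threshold for a moderately aged component
        return "medium"
    return "full"


def adaptive_cost(components) -> int:
    """Test instructions per test period when each component gets a matched suite."""
    return sum(TEST_SUITES[select_test_level(c.health)] for c in components)


def uniform_cost(components) -> int:
    """Test instructions per test period when every component gets the full suite."""
    return TEST_SUITES["full"] * len(components)


def group_by_health(stages, group_size: int):
    """Sort stages by health and chunk them so each group has similar health."""
    ordered = sorted(stages, key=lambda s: s.health, reverse=True)
    return [ordered[i:i + group_size] for i in range(0, len(ordered), group_size)]


if __name__ == "__main__":
    parts = [Component("fetch", 0.9), Component("decode", 0.6), Component("execute", 0.2)]
    print(adaptive_cost(parts), "vs", uniform_cost(parts))
```

Under these made-up numbers, healthy components contribute very little to the total, which is the intuition behind saving testing time on healthy hardware; the actual savings reported in the paper depend on its own sensors, test suites, and lifetime model.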