Credit-based dynamic reliability management using online wearout detection

  • Authors:
  • John Oliver; Rajeevan Amirtharajah; Venkatesh Akella; Frederic T. Chong

  • Affiliations:
  • Cal Poly State University, San Luis Obispo, CA, USA; University of California, Davis, CA, USA; University of California, Davis, CA, USA; University of California, Santa Barbara, CA, USA

  • Venue:
  • Proceedings of the 5th conference on Computing frontiers
  • Year:
  • 2008

Quantified Score

Hi-index 0.00

Abstract

As circuit geometries continue to shrink while supply voltages remain relatively constant, circuit wearout becomes a concern. We propose that the relative reliability of a processor's circuits be exposed to the operating system and managed by a credit-based wearout monitor. This wearout monitor receives dynamic updates on circuit reliability from stability detector circuits that are small enough to be widely deployed. We find that the combined use of the wearout monitor and stability detectors lets us manage the reliability of a processor efficiently and accurately, and recoup the performance that would otherwise be lost when processors are over-provisioned to meet an expected lifetime. We simulate a 16-core DSP with a wearout monitor and stability detectors on a mix of four different media algorithms. Using the wearout monitor and stability detectors, we find that reducing average performance by only 5% increases the lifetime of the processor by 46%.
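
The abstract names a credit-based wearout monitor fed by stability detectors but does not describe its mechanics. The sketch below shows one way such a monitor could be organized, purely as an illustration: each core holds a budget of wear credits, each scheduling interval charges credits, and a detector report rescales the charge. All names (`WearoutMonitor`, `report_detector`, `charge`, `pick_core`), the credit units, and the linear scaling rule are assumptions made for this example, not the authors' design.

```python
from dataclasses import dataclass


@dataclass
class CoreCredits:
    credits: float           # remaining wear budget (hypothetical units)
    wear_scale: float = 1.0  # cost multiplier set from detector feedback


class WearoutMonitor:
    """Hypothetical credit ledger, one entry per core."""

    def __init__(self, num_cores: int, credits_per_core: float) -> None:
        self.cores = [CoreCredits(credits_per_core) for _ in range(num_cores)]

    def report_detector(self, core_id: int, observed_over_expected: float) -> None:
        # Assumed detector interface: a ratio above 1.0 means the core is
        # wearing faster than modelled, so later work costs more credits.
        self.cores[core_id].wear_scale = observed_over_expected

    def charge(self, core_id: int, base_cost: float) -> bool:
        # Charge one scheduling interval; refuse if the budget cannot cover it.
        core = self.cores[core_id]
        cost = base_cost * core.wear_scale
        if cost > core.credits:
            return False
        core.credits -= cost
        return True

    def pick_core(self) -> int:
        # Prefer the core with the most remaining credits.
        return max(range(len(self.cores)), key=lambda i: self.cores[i].credits)
```

In this sketch an operating system scheduler would call `pick_core` to place work, `charge` to account for it, and `report_detector` whenever a detector reading arrives. A refused charge is the point at which the scheduler would migrate or throttle the task.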