The Impact of Technology Scaling on Lifetime Reliability

  • Authors:
  • Jayanth Srinivasan;Sarita V. Adve;Pradip Bose;Jude A. Rivers

  • Affiliations:
  • University of Illinois, Urbana-Champaign;University of Illinois, Urbana-Champaign;IBM T.J. Watson Research Center, Yorktown Heights, NY;IBM T.J. Watson Research Center, Yorktown Heights, NY

  • Venue:
  • DSN '04 Proceedings of the 2004 International Conference on Dependable Systems and Networks
  • Year:
  • 2004

Abstract

The relentless scaling of CMOS technology has provided a steady increase in processor performance for the past three decades. However, increased power densities (hence temperatures) and other scaling effects have an adverse impact on long-term processor lifetime reliability. This paper represents a first attempt at quantifying the impact of scaling on lifetime reliability due to intrinsic hard errors, taking workload characteristics into consideration. For our quantitative evaluation, we use RAMP [The Case for Microarchitectural Awareness of Lifetime Reliability], a previously proposed industrial-strength model that provides reliability estimates for a workload, but for a given technology. We extend RAMP by adding scaling-specific parameters to enable workload-dependent lifetime reliability evaluation at different technologies. We show that (1) scaling has a significant impact on processor hard failure rates: on average, with SPEC benchmarks, we find the failure rate of a scaled 65nm processor to be 316% higher than a similarly pipelined 180nm processor; (2) time-dependent dielectric breakdown and electromigration have the largest increases; and (3) with scaling, the difference in reliability from running at worst-case vs. typical workload operating conditions increases significantly, as does the difference from running different workloads. Our results imply that leveraging a single microarchitecture design for multiple remaps across a few technology generations will become increasingly difficult, and motivate a need for workload-specific, microarchitectural lifetime reliability awareness at an early design stage.