Enhancing multicore reliability through wear compensation in online assignment and scheduling

  • Authors:
  • Thidapat Chantem;Yun Xiang;X. Sharon Hu;Robert P. Dick

  • Affiliations:
  • Utah State University, Logan, UT;University of Michigan, Ann Arbor, MI;University of Notre Dame, Notre Dame, IN;University of Michigan, Ann Arbor, MI

  • Venue:
  • Proceedings of the Conference on Design, Automation and Test in Europe
  • Year:
  • 2013

Abstract

System reliability is a crucial concern, especially in multicore systems, which tend to have high power densities and hence high temperatures. Existing reliability-aware methods are either slow and non-adaptive (offline techniques) or do not use task assignment and scheduling to compensate for uneven core wear states (online techniques). In this article, we present a dynamically activated task assignment and scheduling algorithm, based on theoretical results, that explicitly optimizes system lifetime. We also propose a data distillation method that dramatically reduces the size of the thermal profiles, making full-system reliability analysis viable online. Simulation results show that our algorithm improves system lifetime by 27% to 291% over existing techniques for four-core systems.