Workload assignment considering NBTI degradation in multicore systems

Authors:
Jin Sun;Roman Lysecky;Karthik Shankar;Avinash Kodi;Ahmed Louri;Janet Roveda
Affiliations:
The University of Arizona, Tucson, AZ;The University of Arizona, Tucson, AZ;University of Texas at Austin, Austin, TX;Ohio University, Athens, OH;The University of Arizona, Tucson, AZ;The University of Arizona, Tucson, AZ
Venue:
ACM Journal on Emerging Technologies in Computing Systems (JETC) - Special Issue on Reliability and Device Degradation in Emerging Technologies and Special Issue on WoSAR 2011
Year:
2014

Abstract

As technology continues to scale down, reliability issues such as Negative Bias Temperature Instability (NBTI) have caused considerable degradation of device performance and, ultimately, a shorter mean time to failure (MTTF) for the whole multicore system. This article proposes a new workload balancing scheme, based on a device-level fractional NBTI model, that balances the workload among active cores while relaxing stressed ones. Starting from NBTI-induced threshold voltage degradation, we define the Capacity Rate (CR) as an indication of a core's ability to accept workload. The capacity rate captures a core's performance variability in terms of delay and power metrics under the impact of NBTI aging. The proposed workload balancing framework employs the capacity rates as workload constraints, applies a Dynamic Zoning (DZ) algorithm to group cores into zones that process task flows, and then uses Dynamic Task Scheduling (DTS) to allocate tasks within each zone with balanced workload and minimum communication cost. Experimental results on a 64-core system show that, by allowing a small fraction of the cores to relax over a short time period, the proposed methodology improves multicore system yield (measured as the percentage of core failures) by 20% and extends MTTF by 30%, with insignificant performance degradation (less than 3%).
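
To make the idea of using capacity rates as workload constraints more concrete, the following is a minimal sketch of capacity-weighted task assignment. It is not the article's DZ or DTS algorithm, which the abstract does not specify. It assumes each core's capacity rate is already available as a number between 0 and 1, and it ignores zoning and communication cost entirely. The function name `assign_tasks`, the `relax_below` threshold, and the greedy longest-task-first rule are all hypothetical choices made for illustration.

```python
from typing import Dict, List


def assign_tasks(task_costs: List[float],
                 capacity_rates: List[float],
                 relax_below: float = 0.2) -> Dict[int, List[int]]:
    """Greedy, capacity-weighted task assignment (illustrative only).

    task_costs[i]     -- assumed cost of task i (arbitrary units)
    capacity_rates[c] -- assumed capacity rate of core c, in (0, 1]
    relax_below       -- hypothetical threshold: cores with a lower
                         capacity rate receive no tasks (they "relax")
    Returns a mapping from active core index to a list of task indices.
    """
    # Cores that are too stressed are left out so they can recover.
    active = [c for c, cr in enumerate(capacity_rates)
              if cr >= relax_below and cr > 0]
    if not active:
        raise ValueError("no active cores: all capacity rates are too low")

    load = {c: 0.0 for c in active}
    plan: Dict[int, List[int]] = {c: [] for c in active}

    # Place the most expensive tasks first (a common greedy heuristic).
    for t in sorted(range(len(task_costs)), key=lambda i: -task_costs[i]):
        # Pick the core whose load, scaled by its capacity rate,
        # would be smallest after receiving this task.
        best = min(active,
                   key=lambda c: (load[c] + task_costs[t]) / capacity_rates[c])
        load[best] += task_costs[t]
        plan[best].append(t)
    return plan


if __name__ == "__main__":
    # Core 2 has a low capacity rate, so it is relaxed and gets no tasks.
    print(assign_tasks([4, 3, 2, 1], [1.0, 0.5, 0.1]))
```

In this sketch a core with a lower capacity rate fills up faster because its load is divided by a smaller number, so healthier cores absorb more of the work. A fuller treatment would add the zoning step and a communication-cost term, neither of which can be reconstructed from the abstract alone.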