Moore's Law scaling continues to yield ever higher transistor density with each succeeding process generation, leading to today's Chip Multi-Processors (CMPs) with tens or even hundreds of interconnected cores or tiles. Unfortunately, deep sub-micron CMOS process technology is increasingly susceptible to wearout. Prolonged operational stress accelerates wearout and failure through several physical failure mechanisms, including Hot Carrier Injection (HCI) and Negative Bias Temperature Instability (NBTI). Each failure mechanism correlates with different usage-based stresses, all of which can eventually generate permanent faults. While the wearout of an individual core in a many-core CMP is not necessarily catastrophic for the system, a single fault in the inter-processor Network-on-Chip (NoC) fabric could render the entire chip useless, as it could lead to protocol-level deadlocks or even partition away vital components such as the memory controller or other critical I/O. In this paper, we develop critical path models for HCI- and NBTI-induced wear due to the actual stresses that real workloads apply to the interconnect microarchitecture. A key finding from this modeling is that, counter to prevailing wisdom, wearout in the CMP on-chip interconnect is correlated with a lack of load observed in the NoC routers, rather than with high load. We then develop a novel wearout-decelerating scheme in which routers under low load have their wearout-sensitive components exercised, without significantly impacting the cycle time, pipeline depth, area, or power consumption of the overall router. We subsequently show that the proposed design yields a 13.8x to 65x increase in CMP lifetime.