Scaling feature size improves processor performance but increases each device's susceptibility to defects (i.e., hard errors). As a result, fabrication technology must improve significantly to maintain yields. Redundancy techniques in memory have been successful at improving yield in the presence of defects. Apart from core sparing, which disables faulty cores in a chip multiprocessor, little has been done to target the core logic. While previous work has proposed that either inherent or added redundancy in the core logic can be used to tolerate defects, the key issues of realistic testing and fault isolation have been ignored. This paper is the first to consider testability and fault isolation in designing modern high-performance, defect-tolerant microarchitectures. We define intra-cycle logic independence (ICI) as the condition needed for conventional scan test to isolate faults quickly to the microarchitectural-block granularity. We propose logic transformations to redesign a conventional superscalar microarchitecture to comply with ICI. We call our novel, testable, and defect-tolerant microarchitecture Rescue. We build a Verilog model of Rescue and verify that faults can be isolated to the required precision using only conventional scan test. Using performance simulations, we show that ICI transformations reduce IPC by only 4% on average for SPEC2000 programs. Taking yield improvement into account, Rescue improves average yield-adjusted instruction throughput over core sparing by 12% and 22% at the 32nm and 18nm technology nodes, respectively.
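
To make the notion of yield-adjusted throughput concrete, the following is a minimal sketch that weights each possible core configuration's IPC by its probability of occurring. It is not the paper's model: the Poisson yield formula, the split of a core into one non-redundant region plus two identical redundant halves, and every numeric value except the 4% IPC overhead are assumptions made only for illustration.

```python
import math


def poisson_yield(defect_density, area):
    """Probability that a region has zero defects (assumed Poisson model)."""
    return math.exp(-defect_density * area)


def yat_core_sparing(defect_density, core_area, ipc_full):
    """Expected IPC per core when any defect disables the whole core."""
    return poisson_yield(defect_density, core_area) * ipc_full


def yat_block_disabling(defect_density, core_area, redundant_fraction,
                        ipc_full, ipc_overhead, ipc_degraded):
    """Expected IPC per core when one faulty redundant half can be disabled.

    Hypothetical layout: a non-redundant region that must be defect-free,
    plus two identical redundant halves of which at least one must work.
    """
    p_nonred = poisson_yield(defect_density,
                             (1.0 - redundant_fraction) * core_area)
    p_half = poisson_yield(defect_density,
                           redundant_fraction * core_area / 2.0)
    p_both = p_nonred * p_half * p_half
    p_one = p_nonred * 2.0 * p_half * (1.0 - p_half)
    # Fully working cores still pay the IPC cost of the testability changes.
    return p_both * ipc_full * (1.0 - ipc_overhead) + p_one * ipc_degraded


if __name__ == "__main__":
    # All values below are hypothetical except the 4% IPC overhead.
    sparing = yat_core_sparing(1.0, 1.0, 2.0)
    rescue_like = yat_block_disabling(1.0, 1.0, 0.5, 2.0, 0.04, 1.4)
    print(f"core sparing:    {sparing:.3f}")
    print(f"block disabling: {rescue_like:.3f}")
```

With no defects, the block-disabling scheme is strictly worse because of its IPC overhead; as defect density grows, the degraded-but-working configurations contribute throughput that core sparing discards, which is the trade-off the abstract's yield-adjusted comparison captures.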