With growing semiconductor integration, the reliability of individual transistors is expected to decline rapidly in future technology generations. In such a scenario, processors will need fault tolerance mechanisms to tolerate in-field silicon defects. Periodic online testing is a popular technique for detecting such failures; however, it tends to impose a heavy testing penalty. In this paper, we propose an adaptive online testing framework that significantly reduces the testing overhead. The proposed approach is unique in its ability to assess hardware health and apply suitably detailed tests, so that much of the testing time can be saved on healthy components. We further extend the framework to work with the StageNet CMP fabric, which provides the flexibility to group together pipeline stages with similar health conditions, thereby reducing the overall testing burden. For a modest 2.6% sensor area overhead, the proposed scheme achieved an 80% reduction in software test instructions over the lifetime of a 16-core CMP.
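The idea of matching test detail to hardware health can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the `Component` type, the normalized `health` score (standing in for whatever the wearout sensors report), the three test suites and their instruction counts, and the thresholds are all hypothetical.

```python
from dataclasses import dataclass
from typing import List, Tuple

# Hypothetical test suites as (name, software test instructions).
# The names and instruction counts are illustrative, not from the paper.
TEST_LEVELS: List[Tuple[str, int]] = [
    ("light", 1_000),
    ("medium", 10_000),
    ("full", 100_000),
]


@dataclass
class Component:
    name: str
    health: float  # assumed sensor-derived estimate in [0, 1]; 1.0 = like new


def select_test_level(health: float) -> Tuple[str, int]:
    """Pick a test suite for a health estimate (thresholds are assumptions)."""
    if not 0.0 <= health <= 1.0:
        raise ValueError("health must be in [0, 1]")
    if health >= 0.8:
        return TEST_LEVELS[0]  # healthy: cheapest suite
    if health >= 0.5:
        return TEST_LEVELS[1]  # some wearout: moderately detailed suite
    return TEST_LEVELS[2]      # heavily worn: most detailed suite


def adaptive_cost(components: List[Component]) -> int:
    """Test instructions per round when each component gets its own level."""
    return sum(select_test_level(c.health)[1] for c in components)


def uniform_cost(components: List[Component]) -> int:
    """Test instructions per round when every component gets the full suite."""
    return len(components) * TEST_LEVELS[-1][1]


if __name__ == "__main__":
    stages = [
        Component("fetch", 0.95),
        Component("decode", 0.90),
        Component("issue", 0.60),
        Component("execute", 0.30),
    ]
    print("adaptive:", adaptive_cost(stages))
    print("uniform: ", uniform_cost(stages))
```

In this toy setting, only the worn component pays for the full suite, which is the source of the savings the abstract describes; the actual savings depend on the real sensor data and test suites, which this sketch does not model.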