Proceedings of the 14th international conference on Architectural support for programming languages and operating systems
mSWAT: low-cost hardware fault detection and diagnosis for multicore systems
Proceedings of the 42nd Annual IEEE/ACM International Symposium on Microarchitecture
ASPLOS XVII Proceedings of the seventeenth international conference on Architectural Support for Programming Languages and Operating Systems
Safe and automatic live update for operating systems
Proceedings of the eighteenth international conference on Architectural support for programming languages and operating systems
CrashTest'ing SWAT: accurate, gate-level evaluation of symptom-based resiliency solutions
DATE '12 Proceedings of the Conference on Design, Automation and Test in Europe
Back to the future: fault-tolerant live update with time-traveling state transfer
LISA'13 Proceedings of the 27th international conference on Large Installation System Administration
Hi-index | 0.00 |
In this paper we propose a unified architectural support that can be used flexibly for either soft-error protection or software bug detection. Our approach is based on dynamically detecting and enforcing instruction-level invariants. A hardware table is designed to keep track of run-time invariant information. During program execution, instructions access this table and compare their produced results against the stored invariants. Any violation of the predicted invariant suggests a potential abnormal behavior, which could be a result of a soft error or a latent software bug. In case of a soft error, monitoring invariant violations provides opportunistic soft-error protection to multiple structures in processor pipelines. Our experimental results show that invariant violations detect soft errors promptly and as a result, simple pipeline squashing is able to fix most of the detected soft errors. Meanwhile, the same approach can be easily adapted for software bug detection. The proposed architectural support eliminates the substantial performance overhead associated with software-based bug-detection approaches and enables continuous monitoring of production code.