Technology scaling of integrated circuits is making transistors increasingly sensitive to process variations, wear-out effects, and ionizing particles. This may lead to an increasing rate of transient and intermittent errors in future microprocessors. To assess the risk such errors pose to safety-critical systems, it is essential to investigate how temporary errors in the instruction set architecture (ISA) registers and main memory locations influence the behaviour of executing programs. To this end, we investigate, by means of extensive fault injection experiments, how such errors affect the execution of four target programs. The paper makes three contributions. First, we investigate how the failure modes of the target programs vary for different input sets. Second, we evaluate the error coverage of a software-implemented hardware fault tolerance technique that relies on triple-time redundant execution, majority voting, and forward recovery. Third, we propose an approach based on assembly language metrics that can be used to correlate the dynamic fault-free behaviour of a program with its failure mode distribution obtained by fault injection.