Compilers: principles, techniques, and tools
Compilers: principles, techniques, and tools
Fault Injection for Dependability Validation: A Methodology and Some Applications
IEEE Transactions on Software Engineering
Fault Injection Experiments Using FIAT
IEEE Transactions on Computers
FERRARI: A Flexible Software-Based Fault and Error Injection System
IEEE Transactions on Computers - Special issue on fault-tolerant computing
ACM SIGSOFT Software Engineering Notes
Experiments on six commercial TCP implementations using a software fault injection tool
Software—Practice & Experience
Practical Byzantine fault tolerance
OSDI '99 Proceedings of the third symposium on Operating systems design and implementation
Measuring Fault Tolerance with the FTAPE Fault Injection Tool
MMB '95 Proceedings of the 8th International Conference on Modelling Techniques and Tools for Computer Performance Evaluation: Quantitative Evaluation of Computing and Communication Systems
The Systematic Improvement of Fault Tolerance in the Rio File Cache
FTCS '99 Proceedings of the Twenty-Ninth Annual International Symposium on Fault-Tolerant Computing
Automatic Failure-Path Inference: A Generic Introspection Technique for Internet Applications
WIAPP '03 Proceedings of the The Third IEEE Workshop on Internet Applications
Practices of Software Maintenance
ICSM '98 Proceedings of the International Conference on Software Maintenance
Using fault injection to increase software test coverage
ISSRE '96 Proceedings of the The Seventh International Symposium on Software Reliability Engineering
A Framework for Assessing Dependability in Distributed Systems with Lightweight Fault Injectors
IPDS '00 Proceedings of the 4th International Computer Performance and Dependability Symposium
Testing of java web services for robustness
ISSTA '04 Proceedings of the 2004 ACM SIGSOFT international symposium on Software testing and analysis
Proceedings of the twentieth ACM symposium on Operating systems principles
Automatically Finding and Patching Bad Error Handling
EDCC '06 Proceedings of the Sixth European Dependable Computing Conference
Proceedings of the third ACM SIGPLAN conference on History of programming languages
Verification and Validation of (Real Time) COTS Products using Fault Injection Techniques
ICCBSS '07 Proceedings of the Sixth International IEEE Conference on Commercial-off-the-Shelf (COTS)-Based Software Systems
Profiling and tracing dynamic library usage via interposition
USTC'94 Proceedings of the USENIX Summer 1994 Technical Conference on USENIX Summer 1994 Technical Conference - Volume 1
Detours: binary interception of Win32 functions
WINSYM'99 Proceedings of the 3rd conference on USENIX Windows NT Symposium - Volume 3
Parameter and Return-value Analysis of Binary Executables
COMPSAC '07 Proceedings of the 31st Annual International Computer Software and Applications Conference - Volume 01
Exceptional situations and program reliability
ACM Transactions on Programming Languages and Systems (TOPLAS)
On the Impact of Injection Triggers for OS Robustness Evaluation
ISSRE '07 Proceedings of the The 18th IEEE International Symposium on Software Reliability
EIO: error handling is occasionally correct
FAST'08 Proceedings of the 6th USENIX Conference on File and Storage Technologies
Expect the unexpected: error code mismatches between documentation and the real world
Proceedings of the 9th ACM SIGPLAN-SIGSOFT workshop on Program analysis for software tools and engineering
R2: an application-level kernel for record and replay
OSDI'08 Proceedings of the 8th USENIX conference on Operating systems design and implementation
KLEE: unassisted and automatic generation of high-coverage tests for complex systems programs
OSDI'08 Proceedings of the 8th USENIX conference on Operating systems design and implementation
An extensible technique for high-precision testing of recovery code
USENIXATC'10 Proceedings of the 2010 USENIX conference on USENIX annual technical conference
S2E: a platform for in-vivo multi-path analysis of software systems
Proceedings of the sixteenth international conference on Architectural support for programming languages and operating systems
Life, death, and the critical transition: finding liveness bugs in systems code
NSDI'07 Proceedings of the 4th USENIX conference on Networked systems design & implementation
Fast black-box testing of system recovery code
Proceedings of the 7th ACM european conference on Computer Systems
Diagnosys: automatic generation of a debugging interface to the Linux kernel
Proceedings of the 27th IEEE/ACM International Conference on Automated Software Engineering
Be conservative: enhancing failure diagnosis with proactive logging
OSDI'12 Proceedings of the 10th USENIX conference on Operating Systems Design and Implementation
Adversarial testing of wireless routing implementations
Proceedings of the sixth ACM conference on Security and privacy in wireless and mobile networks
On fault resilience of OpenStack
Proceedings of the 4th annual Symposium on Cloud Computing
Hi-index | 0.00 |
A critical part of developing a reliable software system is testing its recovery code. This code is traditionally difficult to test in the lab, and, in the field, it rarely gets to run; yet, when it does run, it must execute flawlessly in order to recover the system from failure. In this article, we present a library-level fault injection engine that enables the productive use of fault injection for software testing. We describe automated techniques for reliably identifying errors that applications may encounter when interacting with their environment, for automatically identifying high-value injection targets in program binaries, and for producing efficient injection test scenarios. We present a framework for writing precise triggers that inject desired faults, in the form of error return codes and corresponding side effects, at the boundary between applications and libraries. These techniques are embodied in LFI, a new fault injection engine we are distributing http://lfi.epfl.ch. This article includes a report of our initial experience using LFI. Most notably, LFI found 12 serious, previously unreported bugs in the MySQL database server, Git version control system, BIND name server, Pidgin IM client, and PBFT replication system with no developer assistance and no access to source code. LFI also increased recovery-code coverage from virtually zero up to 60% entirely automatically without requiring new tests or human involvement.