Hypervisor-based fault tolerance
SOSP '95 Proceedings of the fifteenth ACM symposium on Operating systems principles
The distribution of faults in a large industrial software system
ISSTA '02 Proceedings of the 2002 ACM SIGSOFT international symposium on Software testing and analysis
Software Rejuvenation: Analysis, Module and Applications
FTCS '95 Proceedings of the Twenty-Fifth International Symposium on Fault-Tolerant Computing
Countering code-injection attacks with instruction-set randomization
Proceedings of the 10th ACM conference on Computer and communications security
The design and implementation of Zap: a system for migrating computing environments
OSDI '02 Proceedings of the 5th symposium on Operating systems design and implementationCopyright restrictions prevent ACM from being able to make the PDFs for this conference available for downloading
Pin: building customized program analysis tools with dynamic instrumentation
Proceedings of the 2005 ACM SIGPLAN conference on Programming language design and implementation
An API for Runtime Code Patching
International Journal of High Performance Computing Applications
ACM Transactions on Programming Languages and Systems (TOPLAS)
Automatically Finding and Patching Bad Error Handling
EDCC '06 Proceedings of the Sixth European Dependable Computing Conference
Debugging operating systems with time-traveling virtual machines
ATEC '05 Proceedings of the annual conference on USENIX Annual Technical Conference
Building a reactive immune system for software services
ATEC '05 Proceedings of the annual conference on USENIX Annual Technical Conference
HOTOS'03 Proceedings of the 9th conference on Hot Topics in Operating Systems - Volume 9
Enhancing server availability and security through failure-oblivious computing
OSDI'04 Proceedings of the 6th conference on Symposium on Opearting Systems Design & Implementation - Volume 6
FormatGuard: automatic protection from printf format string vulnerabilities
SSYM'01 Proceedings of the 10th conference on USENIX Security Symposium - Volume 10
Dynamic and adaptive updates of non-quiescent subsystems in commodity operating system kernels
Proceedings of the 2nd ACM SIGOPS/EuroSys European Conference on Computer Systems 2007
Transparent checkpoint-restart of multiple processes on commodity operating systems
ATC'07 2007 USENIX Annual Technical Conference on Proceedings of the USENIX Annual Technical Conference
Preventing Memory Error Exploits with WIT
SP '08 Proceedings of the 2008 IEEE Symposium on Security and Privacy
ASSURE: automatic software self-healing using rescue points
Proceedings of the 14th international conference on Architectural support for programming languages and operating systems
Automatically patching errors in deployed software
Proceedings of the ACM SIGOPS 22nd symposium on Operating systems principles
A few billion lines of code later: using static analysis to find bugs in the real world
Communications of the ACM
KLEE: unassisted and automatic generation of high-coverage tests for complex systems programs
OSDI'08 Proceedings of the 8th USENIX conference on Operating systems design and implementation
Self-healing multitier architectures using cascading rescue points
Proceedings of the 28th Annual Computer Security Applications Conference
Techniques for efficient in-memory checkpointing
Proceedings of the 9th Workshop on Hot Topics in Dependable Systems
Hi-index | 0.00 |
Software errors are frequently responsible for the limited availability of Internet Services, loss of data, and many security compromises. Self-healing using rescue points (RPs) is a mechanism that can be used to recover software from unforeseen errors until a more permanent remedy, like a patch or update, is available. We present REASSURE, a self-contained mechanism for recovering from such errors using RPs. Essentially, RPs are existing code locations that handle certain anticipated errors in the target application, usually by returning an error code. REASSURE enables the use of these locations to also handle unexpected faults. This is achieved by rolling back execution to a RP when a fault occurs, returning a valid error code, and enabling the application to gracefully handle the unexpected error itself. REASSURE can be applied on already running applications, while disabling and removing it is equally facile. We tested REASSURE with various applications, including the MySQL and Apache servers, and show that it allows them to successfully recover from errors, while incurring moderate overhead between 1% and 115%. We also show that even under very adverse conditions, like their continuous bombardment with errors, REASSURE protected applications remain operational.