We present Dora, a mutable record-replay system which allows a recorded execution of an application to be replayed with a modified version of the application. This feature, not available in previous record-replay systems, enables powerful new functionality. In particular, Dora can help reproduce, diagnose, and fix software bugs by replaying a version of a recorded application that is recompiled with debugging information, reconfigured to produce verbose log output, modified to include additional print statements, or patched to fix a bug. Dora uses lightweight operating system mechanisms to record an application execution by capturing nondeterministic events to a log without imposing unnecessary timing and ordering constraints. It replays the log using a modified version of the application even in the presence of added, deleted, or modified operations that do not match events in the log. Dora searches for a replay that minimizes differences between the log and the replayed execution of the modified program. If there are no modifications, Dora provides deterministic replay of the unmodified program. We have implemented a Linux prototype which provides transparent mutable replay without recompiling or relinking applications. We show that Dora is useful for reproducing, diagnosing, and fixing software bugs in real-world applications, including Apache and MySQL. Our results show that Dora (1) captures bugs and replays them with applications modified or reconfigured to produce additional debugging output for root cause diagnosis, (2) captures exploits and replays them with patched applications to validate that the patches successfully eliminate vulnerabilities, (3) records production workloads and replays them with patched applications to validate patches with realistic workloads, and (4) maintains low recording overhead on commodity multicore hardware, making it suitable for production systems.
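The idea of searching for a replay that minimizes differences between the log and the replayed execution can be illustrated with a small sketch. The abstract does not describe Dora's actual search procedure, so the code below is not Dora's algorithm. It substitutes a classic edit-distance alignment as a stand-in, and the `Event` tuple, the `align` function, and the unit cost per difference are all hypothetical choices made for illustration. It shows how a recorded event sequence might be matched against the events issued by a modified program while tolerating added, deleted, or modified operations.

```python
from typing import List, Tuple

# Hypothetical event representation: (operation name, argument summary).
Event = Tuple[str, str]


def align(log: List[Event], replay: List[Event]) -> Tuple[int, List[str]]:
    """Align a recorded log with a modified program's events.

    Returns the minimal number of differences and one alignment that
    achieves it. This is plain edit distance, used here only as an
    illustrative stand-in for a difference-minimizing search.
    """
    n, m = len(log), len(replay)
    # cost[i][j] = minimal differences aligning log[:i] with replay[:j]
    cost = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        cost[i][0] = i  # every logged event has no counterpart
    for j in range(1, m + 1):
        cost[0][j] = j  # every replayed event is new
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            same = log[i - 1] == replay[j - 1]
            cost[i][j] = min(
                cost[i - 1][j - 1] + (0 if same else 1),  # match or modified
                cost[i - 1][j] + 1,  # logged event no longer issued
                cost[i][j - 1] + 1,  # event added by the modification
            )

    # Walk back through the table to recover one minimal alignment.
    ops: List[str] = []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0:
            same = log[i - 1] == replay[j - 1]
            if cost[i][j] == cost[i - 1][j - 1] + (0 if same else 1):
                kind = "match" if same else "modified"
                ops.append(f"{kind} {replay[j - 1][0]}")
                i, j = i - 1, j - 1
                continue
        if j > 0 and cost[i][j] == cost[i][j - 1] + 1:
            ops.append(f"added {replay[j - 1][0]}")
            j -= 1
        else:
            ops.append(f"deleted {log[i - 1][0]}")
            i -= 1
    ops.reverse()
    return cost[n][m], ops
```

For example, if the log holds an `open`, `read`, `close` sequence and the modified program issues an extra `write` for debugging output between `open` and `read`, `align` reports one difference and labels the `write` as added while matching the other three events to the log. A real system would also have to decide what result to supply for operations that have no logged counterpart, which this sketch does not address.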