Log-based recovery and replay systems are important for system reliability, debugging, and postmortem analysis of, and recovery from, malware attacks. Such systems must incur low space and performance overhead, provide full-system replay, and be resilient against attacks. Previous approaches fail to meet these requirements: they replay only a single process, require changes to the host and guest OS, or lack a fully implemented replay component. This paper studies full-system replay for uniprocessors by logging and replaying architectural events. To limit the amount of logged information, we identify the architectural nondeterministic events and encode them compactly. We present ExecRecorder, a full-system, VM-based log-and-replay framework for post-attack analysis and recovery. ExecRecorder replays the execution of an entire system by checkpointing the system state and logging architectural nondeterministic events, and it imposes low performance overhead (less than 4% on average). In our evaluation, its log files grow at about 5.4 GB/hour (arithmetic mean), so it is practical to log on the order of hours or days between checkpoints. ExecRecorder can also be integrated naturally with an IDS and a post-attack analysis tool for intrusion analysis and recovery.
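The checkpoint-plus-event-log idea described above can be illustrated with a toy sketch. This is not ExecRecorder's implementation: `ToyMachine`, `record`, `replay`, and the choice to position each event by an instruction count are all assumptions made for illustration, and the "machine" is a trivial deterministic state update rather than a virtual CPU. The sketch only shows why a checkpoint and a log of nondeterministic inputs are enough to reproduce an execution.

```python
import copy
import random
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple

# Hypothetical log entry: (instruction count at delivery, event payload).
Event = Tuple[int, int]


@dataclass
class ToyMachine:
    """Toy stand-in for a uniprocessor: deterministic except for injected events."""
    acc: int = 0     # all of the machine's "architectural state"
    icount: int = 0  # instructions executed so far

    def step(self) -> None:
        # Deterministic instruction: same input state always gives same output state.
        self.acc = (self.acc * 31 + 7) % 1_000_003
        self.icount += 1

    def inject(self, value: int) -> None:
        # Nondeterministic input (e.g. device data) arriving from outside.
        self.acc ^= value


def record(machine: ToyMachine, steps: int,
           source: Callable[[int], Optional[int]]) -> List[Event]:
    """Run live, logging only the nondeterministic events and when they arrived."""
    log: List[Event] = []
    for _ in range(steps):
        value = source(machine.icount)
        if value is not None:
            log.append((machine.icount, value))
            machine.inject(value)
        machine.step()
    return log


def replay(machine: ToyMachine, steps: int, log: List[Event]) -> None:
    """Re-run from a checkpoint, injecting logged events at the same points."""
    pending = iter(log)
    nxt = next(pending, None)
    for _ in range(steps):
        if nxt is not None and nxt[0] == machine.icount:
            machine.inject(nxt[1])
            nxt = next(pending, None)
        machine.step()


rng = random.Random()  # unseeded: stands in for real-world nondeterminism


def device(icount: int) -> Optional[int]:
    # Roughly one event every ten instructions, with an arbitrary payload.
    return rng.randrange(256) if rng.random() < 0.1 else None


checkpoint = ToyMachine(acc=5)          # saved system state
live = copy.copy(checkpoint)
event_log = record(live, 1000, device)  # original execution

replayed = copy.copy(checkpoint)
replay(replayed, 1000, event_log)       # reconstruction from checkpoint + log
print(replayed == live)
```

Because every step between events is deterministic, the log needs one entry per nondeterministic event rather than one per instruction, which is the property that keeps log growth manageable in schemes of this kind.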