Debugging Parallel Programs with Instant Replay
IEEE Transactions on Computers
A software instruction counter
ASPLOS III Proceedings of the third international conference on Architectural support for programming languages and operating systems
IGOR: a system for program debugging via reversible execution
PADD '88 Proceedings of the 1988 ACM SIGPLAN and SIGOPS workshop on Parallel and distributed debugging
Hardware-assisted replay of multiprocessor programs
PADD '91 Proceedings of the 1991 ACM/ONR workshop on Parallel and distributed debugging
Optimal tracing and incremental reexecution for debugging long-running programs
PLDI '94 Proceedings of the ACM SIGPLAN 1994 conference on Programming language design and implementation
Hypervisor-based fault tolerance
SOSP '95 Proceedings of the fifteenth ACM symposium on Operating systems principles
Replay for concurrent non-deterministic shared-memory applications
PLDI '96 Proceedings of the ACM SIGPLAN 1996 conference on Programming language design and implementation
Efficient algorithms for bidirectional debugging
PLDI '00 Proceedings of the ACM SIGPLAN 2000 conference on Programming language design and implementation
Communications of the ACM
Reversible Debugging Using Program Instrumentation
IEEE Transactions on Software Engineering
An empirical study of operating systems errors
SOSP '01 Proceedings of the eighteenth ACM symposium on Operating systems principles
An Execution-Backtracking Approach to Debugging
IEEE Software
Virtualizing I/O Devices on VMware Workstation's Hosted Virtual Machine Monitor
Proceedings of the General Track: 2002 USENIX Annual Technical Conference
A "flight data recorder" for enabling full-system multiprocessor deterministic replay
Proceedings of the 30th annual international symposium on Computer architecture
Xen and the art of virtualization
SOSP '03 Proceedings of the nineteenth ACM symposium on Operating systems principles
Scale and performance in the Denali isolation kernel
OSDI '02 Proceedings of the 5th symposium on Operating systems design and implementationCopyright restrictions prevent ACM from being able to make the PDFs for this conference available for downloading
ReVirt: enabling intrusion analysis through virtual-machine logging and replay
OSDI '02 Proceedings of the 5th symposium on Operating systems design and implementationCopyright restrictions prevent ACM from being able to make the PDFs for this conference available for downloading
Optimizing the migration of virtual computers
OSDI '02 Proceedings of the 5th symposium on Operating systems design and implementationCopyright restrictions prevent ACM from being able to make the PDFs for this conference available for downloading
Operating system support for virtual machines
ATEC '03 Proceedings of the annual conference on USENIX Annual Technical Conference
Flashback: a lightweight extension for rollback and deterministic replay for software debugging
ATEC '04 Proceedings of the annual conference on USENIX Annual Technical Conference
Self-securing storage: protecting data in compromised system
OSDI'00 Proceedings of the 4th conference on Symposium on Operating System Design & Implementation - Volume 4
Unmodified device driver reuse and improved system dependability via virtual machines
OSDI'04 Proceedings of the 6th conference on Symposium on Opearting Systems Design & Implementation - Volume 6
A user-mode port of the linux kernel
ALS'00 Proceedings of the 4th annual Linux Showcase & Conference - Volume 4
Jockey: a user-space library for record-replay debugging
Proceedings of the sixth international symposium on Automated analysis-driven debugging
On the design of a pervasive debugger
Proceedings of the sixth international symposium on Automated analysis-driven debugging
Software and the Concurrency Revolution
Queue - Multiprocessors
Detecting past and present intrusions through vulnerability-specific predicates
Proceedings of the twentieth ACM symposium on Operating systems principles
Virtualization for high-performance computing
ACM SIGOPS Operating Systems Review
S2DB: a novel simulation-based debugger for sensor network applications
EMSOFT '06 Proceedings of the 6th ACM & IEEE International conference on Embedded software
Replayer: automatic protocol replay by binary analysis
Proceedings of the 13th ACM conference on Computer and communications security
Emergent (mis)behavior vs. complex software systems
Proceedings of the 1st ACM SIGOPS/EuroSys European Conference on Computer Systems 2006
Framework for instruction-level tracing and analysis of program executions
Proceedings of the 2nd international conference on Virtual execution environments
A Technique for Enabling and Supporting Debugging of Field Failures
ICSE '07 Proceedings of the 29th international conference on Software Engineering
Automatic on-line failure diagnosis at the end-user site
HOTDEP'06 Proceedings of the 2nd conference on Hot Topics in System Dependability - Volume 2
Are virtual machine monitors microkernels done right?
HOTOS'05 Proceedings of the 10th conference on Hot Topics in Operating Systems - Volume 10
Parallax: managing storage for a million machines
HOTOS'05 Proceedings of the 10th conference on Hot Topics in Operating Systems - Volume 10
Replay debugging for distributed applications
ATEC '06 Proceedings of the annual conference on USENIX '06 Annual Technical Conference
JIT instrumentation: a novel approach to dynamically instrument operating systems
Proceedings of the 2nd ACM SIGOPS/EuroSys European Conference on Computer Systems 2007
Sweeper: a lightweight end-to-end system for defending against fast worms
Proceedings of the 2nd ACM SIGOPS/EuroSys European Conference on Computer Systems 2007
Efficient checkpointing of java software using context-sensitive capture and replay
Proceedings of the the 6th joint meeting of the European software engineering conference and the ACM SIGSOFT symposium on The foundations of software engineering
Proceedings of twenty-first ACM SIGOPS symposium on Operating systems principles
Triage: diagnosing production run failures at the user's site
Proceedings of twenty-first ACM SIGOPS symposium on Operating systems principles
/*icomment: bugs or bad comments?*/
Proceedings of twenty-first ACM SIGOPS symposium on Operating systems principles
DejaView: a personal virtual computer recorder
Proceedings of twenty-first ACM SIGOPS symposium on Operating systems principles
Stealthy malware detection through vmm-based "out-of-the-box" semantic view reconstruction
Proceedings of the 14th ACM conference on Computer and communications security
An independent audit framework for software dependent voting systems
Proceedings of the 14th ACM conference on Computer and communications security
Simulation-based augmented reality for sensor network development
Proceedings of the 5th international conference on Embedded networked sensor systems
Virtual machine time travel using continuous data protection and checkpointing
ACM SIGOPS Operating Systems Review
Parallax: virtual disks for virtual machines
Proceedings of the 3rd ACM SIGOPS/EuroSys European Conference on Computer Systems 2008
Remus: high availability via asynchronous virtual machine replication
NSDI'08 Proceedings of the 5th USENIX Symposium on Networked Systems Design and Implementation
Decoupling dynamic program analysis from execution in virtual environments
ATC'08 USENIX 2008 Annual Technical Conference on Annual Technical Conference
Architecting Dependable and Secure Systems Using Virtualization
Architecting Dependable Systems V
Efficiently tracking application interactions using lightweight virtualization
Proceedings of the 1st ACM workshop on Virtual machine security
ASSURE: automatic software self-healing using rescue points
Proceedings of the 14th international conference on Architectural support for programming languages and operating systems
Capo: a software-hardware interface for practical deterministic multiprocessor replay
Proceedings of the 14th international conference on Architectural support for programming languages and operating systems
First-aid: surviving and preventing memory management bugs during production runs
Proceedings of the 4th ACM European conference on Computer systems
Transparent checkpoints of closed distributed systems in Emulab
Proceedings of the 4th ACM European conference on Computer systems
Tralfamadore: unifying source code and execution experience
Proceedings of the 4th ACM European conference on Computer systems
FlashBox: a system for logging non-deterministic events in deployed embedded systems
Proceedings of the 2009 ACM symposium on Applied Computing
Live migration of virtual machine based on full system trace and replay
Proceedings of the 18th ACM international symposium on High performance distributed computing
Software Profiling for Deterministic Replay Debugging of User Code
Proceedings of the 2006 conference on New Trends in Software Methodologies, Tools and Techniques: Proceedings of the fifth SoMeT_06
PRES: probabilistic replay with execution sketching on multiprocessors
Proceedings of the ACM SIGOPS 22nd symposium on Operating systems principles
ODR: output-deterministic replay for multicore debugging
Proceedings of the ACM SIGOPS 22nd symposium on Operating systems principles
VCR: Virtual Capture and Replay for Performance Testing
ASE '08 Proceedings of the 2008 23rd IEEE/ACM International Conference on Automated Software Engineering
Offline symbolic analysis for multi-processor execution replay
Proceedings of the 42nd Annual IEEE/ACM International Symposium on Microarchitecture
ACM Transactions on Information and System Security (TISSEC)
Capability wrangling made easy: debugging on a microkernel with valgrind
Proceedings of the 6th ACM SIGPLAN/SIGOPS international conference on Virtual execution environments
Multi-stage replay with crosscut
Proceedings of the 6th ACM SIGPLAN/SIGOPS international conference on Virtual execution environments
Respec: efficient online multiprocessor replayvia speculation and external determinism
Proceedings of the fifteenth edition of ASPLOS on Architectural support for programming languages and operating systems
SherLog: error diagnosis by connecting clues from run-time logs
Proceedings of the fifteenth edition of ASPLOS on Architectural support for programming languages and operating systems
Analyzing multicore dumps to facilitate concurrency bug reproduction
Proceedings of the fifteenth edition of ASPLOS on Architectural support for programming languages and operating systems
Reverse engineering of binary device drivers with RevNIC
Proceedings of the 5th European conference on Computer systems
Otherworld: giving applications a chance to survive OS kernel crashes
Proceedings of the 5th European conference on Computer systems
NOVA: a microhypervisor-based secure virtualization architecture
Proceedings of the 5th European conference on Computer systems
Execution synthesis: a technique for automated software debugging
Proceedings of the 5th European conference on Computer systems
PinPlay: a framework for deterministic replay and reproducible analysis of parallel programs
Proceedings of the 8th annual IEEE/ACM international symposium on Code generation and optimization
"Out-of-the-Box" monitoring of VM-based high-interaction honeypots
RAID'07 Proceedings of the 10th international conference on Recent advances in intrusion detection
Efficient model checking of applications with input/output
EUROCAST'07 Proceedings of the 11th international conference on Computer aided systems theory
Video indexed VM continuous checkpoints: time travel support for virtual 3d graphics applications
Proceedings of the 20th international workshop on Network and operating systems support for digital audio and video
"Otherworld": giving applications a chance to survive OS kernel crashes
HotDep'08 Proceedings of the Fourth conference on Hot topics in system dependability
Lightweight, high-resolution monitoring for troubleshooting production systems
OSDI'08 Proceedings of the 8th USENIX conference on Operating systems design and implementation
R2: an application-level kernel for record and replay
OSDI'08 Proceedings of the 8th USENIX conference on Operating systems design and implementation
Testing closed-source binary device drivers with DDT
USENIXATC'10 Proceedings of the 2010 USENIX conference on USENIX annual technical conference
Dynamic and transparent analysis of commodity production systems
Proceedings of the IEEE/ACM international conference on Automated software engineering
Language-based replay via data flow cut
Proceedings of the eighteenth ACM SIGSOFT international symposium on Foundations of software engineering
Monitoring, analysis, and testing of deployed software
Proceedings of the FSE/SDP workshop on Future of software engineering research
Checkpoint/restart-enabled parallel debugging
EuroMPI'10 Proceedings of the 17th European MPI users' group meeting conference on Recent advances in the message passing interface
OSDI'10 Proceedings of the 9th USENIX conference on Operating systems design and implementation
Efficient system-enforced deterministic parallelism
OSDI'10 Proceedings of the 9th USENIX conference on Operating systems design and implementation
InstantCheck: Checking the Determinism of Parallel Programs Using On-the-Fly Incremental Hashing
MICRO '43 Proceedings of the 2010 43rd Annual IEEE/ACM International Symposium on Microarchitecture
Improving software diagnosability via log enhancement
Proceedings of the sixteenth international conference on Architectural support for programming languages and operating systems
DoublePlay: parallelizing sequential logging and replay
Proceedings of the sixteenth international conference on Architectural support for programming languages and operating systems
Rethink the virtual machine template
Proceedings of the 7th ACM SIGPLAN/SIGOPS international conference on Virtual execution environments
Fast and space-efficient virtual machine checkpointing
Proceedings of the 7th ACM SIGPLAN/SIGOPS international conference on Virtual execution environments
Debugging large scale applications in a virtualized environment
LCPC'10 Proceedings of the 23rd international conference on Languages and compilers for parallel computing
Automatic on-line failure diagnosis at the end-user site
HotDep'06 Proceedings of the Second conference on Hot topics in system dependability
Friday: global comprehension for distributed replay
NSDI'07 Proceedings of the 4th USENIX conference on Networked systems design & implementation
Toward generating reducible replay logs
Proceedings of the 32nd ACM SIGPLAN conference on Programming language design and implementation
Context-based online configuration-error detection
USENIXATC'11 Proceedings of the 2011 USENIX conference on USENIX annual technical conference
OFRewind: enabling record and replay troubleshooting for networks
USENIXATC'11 Proceedings of the 2011 USENIX conference on USENIX annual technical conference
Partial replay of long-running applications
Proceedings of the 19th ACM SIGSOFT symposium and the 13th European conference on Foundations of software engineering
URDB: a universal reversible debugger based on decomposing debugging histories
PLOS '11 Proceedings of the 6th Workshop on Programming Languages and Operating Systems
SPARC: a security and privacy aware virtual machinecheckpointing mechanism
Proceedings of the 10th annual ACM workshop on Privacy in the electronic society
REASSURE: a self-contained mechanism for healing software using rescue points
IWSEC'11 Proceedings of the 6th International conference on Advances in information and computer security
A virtual memory foundation for scalable deterministic parallelism
Proceedings of the Second Asia-Pacific Workshop on Systems
DoublePlay: Parallelizing Sequential Logging and Replay
ACM Transactions on Computer Systems (TOCS) - Special Issue APLOS 2011
Improving Software Diagnosability via Log Enhancement
ACM Transactions on Computer Systems (TOCS) - Special Issue APLOS 2011
Monitoring student progress using virtual appliances: A case study
Computers & Education
VEE '12 Proceedings of the 8th ACM SIGPLAN/SIGOPS conference on Virtual Execution Environments
Can deterministic replay be an enabling tool for mobile computing?
Proceedings of the 12th Workshop on Mobile Computing Systems and Applications
Chimera: hybrid program analysis for determinism
Proceedings of the 33rd ACM SIGPLAN conference on Programming Language Design and Implementation
Compiler support for fine-grain software-only checkpointing
CC'12 Proceedings of the 21st international conference on Compiler Construction
Speculative Memory State Transfer for Active-Active Fault Tolerance
CCGRID '12 Proceedings of the 2012 12th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing (ccgrid 2012)
Proceedings of the 39th Annual International Symposium on Computer Architecture
Stride: search-based deterministic replay in polynomial time via bounded linkage
Proceedings of the 34th International Conference on Software Engineering
AutoDunt: dynamic latent dependence analysis for detection of zero day vulnerability
ICISC'11 Proceedings of the 14th international conference on Information Security and Cryptology
Self-healing multitier architectures using cascading rescue points
Proceedings of the 28th Annual Computer Security Applications Conference
Scalable deterministic replay in a parallel full-system emulator
Proceedings of the 18th ACM SIGPLAN symposium on Principles and practice of parallel programming
ConAir: featherweight concurrency bug recovery via single-threaded idempotent execution
Proceedings of the eighteenth international conference on Architectural support for programming languages and operating systems
Cyrus: unintrusive application-level record-replay for replay parallelism
Proceedings of the eighteenth international conference on Architectural support for programming languages and operating systems
Composing OS extensions safely and efficiently with Bascule
Proceedings of the 8th ACM European Conference on Computer Systems
Proceedings of the 40th Annual International Symposium on Computer Architecture
Expositor: scriptable time-travel debugging with first-class traces
Proceedings of the 2013 International Conference on Software Engineering
Interactive record/replay for web application debugging
Proceedings of the 26th annual ACM symposium on User interface software and technology
Techniques for efficient in-memory checkpointing
Proceedings of the 9th Workshop on Hot Topics in Dependable Systems
Semi-automated debugging via binary search through a process lifetime
Proceedings of the Seventh Workshop on Programming Languages and Operating Systems
Direct code execution: revisiting library OS architecture for reproducible network experiments
Proceedings of the ninth ACM conference on Emerging networking experiments and technologies
DEFINED: deterministic execution for interactive control-plane debugging
USENIX ATC'13 Proceedings of the 2013 USENIX conference on Annual Technical Conference
Underprovisioning backup power infrastructure for datacenters
Proceedings of the 19th international conference on Architectural support for programming languages and operating systems
Leveraging the short-term memory of hardware to diagnose production-run software failures
Proceedings of the 19th international conference on Architectural support for programming languages and operating systems
RelaxReplay: record and replay for relaxed-consistency multiprocessors
Proceedings of the 19th international conference on Architectural support for programming languages and operating systems
Hi-index | 0.00 |
Operating systems are difficult to debug with traditional cyclic debugging. They are non-deterministic; they run for long periods of time; they interact directly with hard-ware devices; and their state is easily perturbed by the act of debugging. This paper describes a time-traveling virtual machine that overcomes many of the difficulties associated with debugging operating systems. Time travel enables a programmer to navigate backward and forward arbitrarily through the execution history of a particular run and to replay arbitrary segments of the past execution. We integrate time travel into a general-purpose debugger to enable a programmer to debug an OS in reverse, implementing commands such as reverse breakpoint, reverse watchpoint, and reverse single step. The space and time overheads needed to support time travel are reasonable for debugging, and movements in time are fast enough to support interactive debugging. We demonstrate the value of our time-traveling virtual machine by using it to understand and fix several OS bugs that are difficult to find with standard debugging tools. Reverse debugging is especially helpful in finding bugs that are fragile due to non-determinism, bugs in device drivers, bugs that require long runs to trigger, bugs that corrupt the stack, and bugs that are detected after the relevant stack frame is popped.