Debugging Parallel Programs with Instant Replay
IEEE Transactions on Computers
Hardware-assisted replay of multiprocessor programs
PADD '91 Proceedings of the 1991 ACM/ONR workshop on Parallel and distributed debugging
The SPLASH-2 programs: characterization and methodological considerations
ISCA '95 Proceedings of the 22nd annual international symposium on Computer architecture
Hypervisor-based fault tolerance
SOSP '95 Proceedings of the fifteenth ACM symposium on Operating systems principles
Hypervisor-based fault tolerance
ACM Transactions on Computer Systems (TOCS) - Special issue on operating system principles
Deterministic replay of Java multithreaded applications
SPDT '98 Proceedings of the SIGMETRICS symposium on Parallel and distributed tools
A survey of rollback-recovery protocols in message-passing systems
ACM Computing Surveys (CSUR)
Adaptive Message Logging for Incremental Program Replay
IEEE Parallel & Distributed Technology: Systems & Technology
A "flight data recorder" for enabling full-system multiprocessor deterministic replay
Proceedings of the 30th annual international symposium on Computer architecture
RDB: A System for Incremental Replay Debugging
RDB: A System for Incremental Replay Debugging
Xen and the art of virtualization
SOSP '03 Proceedings of the nineteenth ACM symposium on Operating systems principles
ReVirt: enabling intrusion analysis through virtual-machine logging and replay
OSDI '02 Proceedings of the 5th symposium on Operating systems design and implementationCopyright restrictions prevent ACM from being able to make the PDFs for this conference available for downloading
BugNet: Continuously Recording Program Execution for Deterministic Replay Debugging
Proceedings of the 32nd annual international symposium on Computer Architecture
A regulated transitive reduction (RTR) for longer memory race recording
Proceedings of the 12th international conference on Architectural support for programming languages and operating systems
Rerun: Exploiting Episodes for Lightweight Memory Race Recording
ISCA '08 Proceedings of the 35th Annual International Symposium on Computer Architecture
Capo: a software-hardware interface for practical deterministic multiprocessor replay
Proceedings of the 14th international conference on Architectural support for programming languages and operating systems
Kendo: efficient deterministic multithreading in software
Proceedings of the 14th international conference on Architectural support for programming languages and operating systems
Two hardware-based approaches for deterministic multiprocessor replay
Communications of the ACM - One Laptop Per Child: Vision vs. Reality
Transparent checkpoints of closed distributed systems in Emulab
Proceedings of the 4th ACM European conference on Computer systems
Proceedings of the 4th International Symposium on Information, Computer, and Communications Security
PRES: probabilistic replay with execution sketching on multiprocessors
Proceedings of the ACM SIGOPS 22nd symposium on Operating systems principles
ODR: output-deterministic replay for multicore debugging
Proceedings of the ACM SIGOPS 22nd symposium on Operating systems principles
Respec: efficient online multiprocessor replayvia speculation and external determinism
Proceedings of the fifteenth edition of ASPLOS on Architectural support for programming languages and operating systems
SherLog: error diagnosis by connecting clues from run-time logs
Proceedings of the fifteenth edition of ASPLOS on Architectural support for programming languages and operating systems
Analyzing multicore dumps to facilitate concurrency bug reproduction
Proceedings of the fifteenth edition of ASPLOS on Architectural support for programming languages and operating systems
A randomized scheduler with probabilistic guarantees of finding bugs
Proceedings of the fifteenth edition of ASPLOS on Architectural support for programming languages and operating systems
Kivati: fast detection and prevention of atomicity violations
Proceedings of the 5th European conference on Computer systems
Execution synthesis: a technique for automated software debugging
Proceedings of the 5th European conference on Computer systems
Transparent, lightweight application execution replay on commodity multiprocessor operating systems
Proceedings of the ACM SIGMETRICS international conference on Measurement and modeling of computer systems
Finding and reproducing Heisenbugs in concurrent programs
OSDI'08 Proceedings of the 8th USENIX conference on Operating systems design and implementation
Testing closed-source binary device drivers with DDT
USENIXATC'10 Proceedings of the 2010 USENIX conference on USENIX annual technical conference
Determinating timing channels in compute clouds
Proceedings of the 2010 ACM workshop on Cloud computing security workshop
Instrumentation and sampling strategies for cooperative concurrency bug isolation
Proceedings of the ACM international conference on Object oriented programming systems languages and applications
LEAP: lightweight deterministic multi-processor replay of concurrent java programs
Proceedings of the eighteenth ACM SIGSOFT international symposium on Foundations of software engineering
Paranoid Android: versatile protection for smartphones
Proceedings of the 26th Annual Computer Security Applications Conference
Focus replay debugging effort on the control plane
HotDep'10 Proceedings of the Sixth international conference on Hot topics in system dependability
OSDI'10 Proceedings of the 9th USENIX conference on Operating systems design and implementation
Bypassing races in live applications with execution filters
OSDI'10 Proceedings of the 9th USENIX conference on Operating systems design and implementation
Deterministic process groups in dOS
OSDI'10 Proceedings of the 9th USENIX conference on Operating systems design and implementation
Efficient system-enforced deterministic parallelism
OSDI'10 Proceedings of the 9th USENIX conference on Operating systems design and implementation
Stable deterministic multithreading through schedule memoization
OSDI'10 Proceedings of the 9th USENIX conference on Operating systems design and implementation
COREMU: a scalable and portable parallel full-system emulator
Proceedings of the 16th ACM symposium on Principles and practice of parallel programming
Improving software diagnosability via log enhancement
Proceedings of the sixteenth international conference on Architectural support for programming languages and operating systems
DoublePlay: parallelizing sequential logging and replay
Proceedings of the sixteenth international conference on Architectural support for programming languages and operating systems
2ndStrike: toward manifesting hidden concurrency typestate bugs
Proceedings of the sixteenth international conference on Architectural support for programming languages and operating systems
Debug determinism: the sweet spot for replay-based debugging
HotOS'13 Proceedings of the 13th USENIX conference on Hot topics in operating systems
Toward generating reducible replay logs
Proceedings of the 32nd ACM SIGPLAN conference on Programming language design and implementation
Record and transplay: partial checkpointing for replay debugging across heterogeneous systems
Proceedings of the ACM SIGMETRICS joint international conference on Measurement and modeling of computer systems
Karma: scalable deterministic record-replay
Proceedings of the international conference on Supercomputing
ORDER: object centric deterministic replay for Java
USENIXATC'11 Proceedings of the 2011 USENIX conference on USENIX annual technical conference
Record and transplay: partial checkpointing for replay debugging across heterogeneous systems
ACM SIGMETRICS Performance Evaluation Review - Performance evaluation review
Efficient deterministic multithreading through schedule relaxation
SOSP '11 Proceedings of the Twenty-Third ACM Symposium on Operating Systems Principles
Pervasive detection of process races in deployed systems
SOSP '11 Proceedings of the Twenty-Third ACM Symposium on Operating Systems Principles
SPARC: a security and privacy aware virtual machinecheckpointing mechanism
Proceedings of the 10th annual ACM workshop on Privacy in the electronic society
Accentuating the positive: atomicity inference and enforcement using correct executions
Proceedings of the 2011 ACM international conference on Object oriented programming systems languages and applications
ReNIC: Architectural extension to SR-IOV I/O virtualization for efficient replication
ACM Transactions on Architecture and Code Optimization (TACO) - HIPEAC Papers
DoublePlay: Parallelizing Sequential Logging and Replay
ACM Transactions on Computer Systems (TOCS) - Special Issue APLOS 2011
Improving Software Diagnosability via Log Enhancement
ACM Transactions on Computer Systems (TOCS) - Special Issue APLOS 2011
WODA '09 Proceedings of the Seventh International Workshop on Dynamic Analysis
Aikido: accelerating shared data dynamic analyses
ASPLOS XVII Proceedings of the seventeenth international conference on Architectural Support for Programming Languages and Operating Systems
Enhancing TCP throughput of highly available virtual machines via speculative communication
VEE '12 Proceedings of the 8th ACM SIGPLAN/SIGOPS conference on Virtual Execution Environments
SecondSite: disaster tolerance as a service
VEE '12 Proceedings of the 8th ACM SIGPLAN/SIGOPS conference on Virtual Execution Environments
CoreRacer: a practical memory race recorder for multicore x86 TSO processors
Proceedings of the 44th Annual IEEE/ACM International Symposium on Microarchitecture
Efficient system-enforced deterministic parallelism
Communications of the ACM
Can deterministic replay be an enabling tool for mobile computing?
Proceedings of the 12th Workshop on Mobile Computing Systems and Applications
Chimera: hybrid program analysis for determinism
Proceedings of the 33rd ACM SIGPLAN conference on Programming Language Design and Implementation
End-to-end sequential consistency
Proceedings of the 39th Annual International Symposium on Computer Architecture
CCTR: An efficient point-to-point memory race recorder implemented in chunks
Microprocessors & Microsystems
LEAN: simplifying concurrency bug reproduction via replay-supported execution reduction
Proceedings of the ACM international conference on Object oriented programming systems languages and applications
All about Eve: execute-verify replication for multi-core servers
OSDI'12 Proceedings of the 10th USENIX conference on Operating Systems Design and Implementation
RemusDB: transparent high availability for database systems
The VLDB Journal — The International Journal on Very Large Data Bases
Future Generation Computer Systems
Scalable deterministic replay in a parallel full-system emulator
Proceedings of the 18th ACM SIGPLAN symposium on Principles and practice of parallel programming
Deterministic Replay Using Global Clock
ACM Transactions on Architecture and Code Optimization (TACO)
Transparent mutable replay for multicore debugging and patch validation
Proceedings of the eighteenth international conference on Architectural support for programming languages and operating systems
Cyrus: unintrusive application-level record-replay for replay parallelism
Proceedings of the eighteenth international conference on Architectural support for programming languages and operating systems
DDOS: taming nondeterminism in distributed systems
Proceedings of the eighteenth international conference on Architectural support for programming languages and operating systems
Proceedings of the 40th Annual International Symposium on Computer Architecture
Automated debugging for arbitrarily long executions
HotOS'13 Proceedings of the 14th USENIX conference on Hot Topics in Operating Systems
A survey of migration mechanisms of virtual machines
ACM Computing Surveys (CSUR)
COLO: COarse-grained LOck-stepping virtual machines for non-stop service
Proceedings of the 4th annual Symposium on Cloud Computing
Pico replication: a high availability framework for middleboxes
Proceedings of the 4th annual Symposium on Cloud Computing
Semi-automated debugging via binary search through a process lifetime
Proceedings of the Seventh Workshop on Programming Languages and Operating Systems
MiG: efficient migration of desktop VMs using semantic compression
USENIX ATC'13 Proceedings of the 2013 USENIX conference on Annual Technical Conference
Hi-index | 0.02 |
Execution replay of virtual machines is a technique which has many important applications, including debugging, fault-tolerance, and security. Execution replay for single processor virtual machines is well-understood, and available commercially. With the advancement of multi-core architectures, however, multiprocessor virtual machines are becoming more important. Our system, SMP-ReVirt, is the first system to log and replay a multiprocessor virtual machine on commodity hardware. We use hardware page protection to detect and accurately replay sharing between virtual cpus of a multi-cpu virtual machine, allowing us to replay the entire operating system and all applications. We have tested our system on a variety of workloads, and find that although sharing under SMP-ReVirt is expensive, for many workloads and applications, including debugging, the overhead is acceptable.