Debugging Parallel Programs with Instant Replay
IEEE Transactions on Computers
IGOR: a system for program debugging via reversible execution
PADD '88 Proceedings of the 1988 ACM SIGPLAN and SIGOPS workshop on Parallel and distributed debugging
Supporting reverse execution for parallel programs
PADD '88 Proceedings of the 1988 ACM SIGPLAN and SIGOPS workshop on Parallel and distributed debugging
Debugging of heterogeneous parallel systems
PADD '88 Proceedings of the 1988 ACM SIGPLAN and SIGOPS workshop on Parallel and distributed debugging
Hardware-assisted replay of multiprocessor programs
PADD '91 Proceedings of the 1991 ACM/ONR workshop on Parallel and distributed debugging
Optimal tracing and replay for debugging shared-memory parallel programs
PADD '93 Proceedings of the 1993 ACM/ONR workshop on Parallel and distributed debugging
Hypervisor-based fault tolerance
SOSP '95 Proceedings of the fifteenth ACM symposium on Operating systems principles
Replay for concurrent non-deterministic shared-memory applications
PLDI '96 Proceedings of the ACM SIGPLAN 1996 conference on Programming language design and implementation
Deterministic replay of Java multithreaded applications
SPDT '98 Proceedings of the SIGMETRICS symposium on Parallel and distributed tools
Deciding when to forget in the Elephant file system
Proceedings of the seventeenth ACM symposium on Operating systems principles
Efficient algorithms for bidirectional debugging
PLDI '00 Proceedings of the ACM SIGPLAN 2000 conference on Programming language design and implementation
Communications of the ACM
Reversible Debugging Using Program Instrumentation
IEEE Transactions on Software Engineering
An Execution-Backtracking Approach to Debugging
IEEE Software
A "flight data recorder" for enabling full-system multiprocessor deterministic replay
Proceedings of the 30th annual international symposium on Computer architecture
SOSP '03 Proceedings of the nineteenth ACM symposium on Operating systems principles
ReVirt: enabling intrusion analysis through virtual-machine logging and replay
OSDI '02 Proceedings of the 5th symposium on Operating systems design and implementationCopyright restrictions prevent ACM from being able to make the PDFs for this conference available for downloading
BugNet: Continuously Recording Program Execution for Deterministic Replay Debugging
Proceedings of the 32nd annual international symposium on Computer Architecture
Detecting past and present intrusions through vulnerability-specific predicates
Proceedings of the twentieth ACM symposium on Operating systems principles
A regulated transitive reduction (RTR) for longer memory race recording
Proceedings of the 12th international conference on Architectural support for programming languages and operating systems
Recording shared memory dependencies using strata
Proceedings of the 12th international conference on Architectural support for programming languages and operating systems
Debugging operating systems with time-traveling virtual machines
ATEC '05 Proceedings of the annual conference on USENIX Annual Technical Conference
Flashback: a lightweight extension for rollback and deterministic replay for software debugging
ATEC '04 Proceedings of the annual conference on USENIX Annual Technical Conference
BulkSC: bulk enforcement of sequential consistency
Proceedings of the 34th annual international symposium on Computer architecture
File system design for an NFS file server appliance
WTEC'94 Proceedings of the USENIX Winter 1994 Technical Conference on USENIX Winter 1994 Technical Conference
Execution replay of multiprocessor virtual machines
Proceedings of the fourth ACM SIGPLAN/SIGOPS international conference on Virtual execution environments
Rerun: Exploiting Episodes for Lightweight Memory Race Recording
ISCA '08 Proceedings of the 35th Annual International Symposium on Computer Architecture
ISCA '08 Proceedings of the 35th Annual International Symposium on Computer Architecture
PRES: probabilistic replay with execution sketching on multiprocessors
Proceedings of the ACM SIGOPS 22nd symposium on Operating systems principles
ODR: output-deterministic replay for multicore debugging
Proceedings of the ACM SIGOPS 22nd symposium on Operating systems principles
Ripley: automatically securing web 2.0 applications through replicated execution
Proceedings of the 16th ACM conference on Computer and communications security
Offline symbolic analysis for multi-processor execution replay
Proceedings of the 42nd Annual IEEE/ACM International Symposium on Microarchitecture
Architecting a chunk-based memory race recorder in modern CMPs
Proceedings of the 42nd Annual IEEE/ACM International Symposium on Microarchitecture
Respec: efficient online multiprocessor replayvia speculation and external determinism
Proceedings of the fifteenth edition of ASPLOS on Architectural support for programming languages and operating systems
Analyzing multicore dumps to facilitate concurrency bug reproduction
Proceedings of the fifteenth edition of ASPLOS on Architectural support for programming languages and operating systems
ParaLog: enabling and accelerating online parallel monitoring of multithreaded applications
Proceedings of the fifteenth edition of ASPLOS on Architectural support for programming languages and operating systems
Alhambra: a system for creating, enforcing, and testing browser security policies
Proceedings of the 19th international conference on World wide web
PinPlay: a framework for deterministic replay and reproducible analysis of parallel programs
Proceedings of the 8th annual IEEE/ACM international symposium on Code generation and optimization
Transparent, lightweight application execution replay on commodity multiprocessor operating systems
Proceedings of the ACM SIGMETRICS international conference on Measurement and modeling of computer systems
LReplay: a pending period based deterministic replay scheme
Proceedings of the 37th annual international symposium on Computer architecture
LEAP: lightweight deterministic multi-processor replay of concurrent java programs
Proceedings of the eighteenth ACM SIGSOFT international symposium on Foundations of software engineering
Paranoid Android: versatile protection for smartphones
Proceedings of the 26th Annual Computer Security Applications Conference
Focus replay debugging effort on the control plane
HotDep'10 Proceedings of the Sixth international conference on Hot topics in system dependability
Bypassing races in live applications with execution filters
OSDI'10 Proceedings of the 9th USENIX conference on Operating systems design and implementation
Deterministic process groups in dOS
OSDI'10 Proceedings of the 9th USENIX conference on Operating systems design and implementation
Stable deterministic multithreading through schedule memoization
OSDI'10 Proceedings of the 9th USENIX conference on Operating systems design and implementation
InstantCheck: Checking the Determinism of Parallel Programs Using On-the-Fly Incremental Hashing
MICRO '43 Proceedings of the 2010 43rd Annual IEEE/ACM International Symposium on Microarchitecture
Toward generating reducible replay logs
Proceedings of the 32nd ACM SIGPLAN conference on Programming language design and implementation
Record and transplay: partial checkpointing for replay debugging across heterogeneous systems
Proceedings of the ACM SIGMETRICS joint international conference on Measurement and modeling of computer systems
Karma: scalable deterministic record-replay
Proceedings of the international conference on Supercomputing
OFRewind: enabling record and replay troubleshooting for networks
USENIXATC'11 Proceedings of the 2011 USENIX conference on USENIX annual technical conference
ORDER: object centric deterministic replay for Java
USENIXATC'11 Proceedings of the 2011 USENIX conference on USENIX annual technical conference
Record and transplay: partial checkpointing for replay debugging across heterogeneous systems
ACM SIGMETRICS Performance Evaluation Review - Performance evaluation review
Efficient deterministic multithreading through schedule relaxation
SOSP '11 Proceedings of the Twenty-Third ACM Symposium on Operating Systems Principles
Accentuating the positive: atomicity inference and enforcement using correct executions
Proceedings of the 2011 ACM international conference on Object oriented programming systems languages and applications
CoreRacer: a practical memory race recorder for multicore x86 TSO processors
Proceedings of the 44th Annual IEEE/ACM International Symposium on Microarchitecture
Exploiting parallelism in deterministic shared memory multiprocessing
Journal of Parallel and Distributed Computing
Yada: Straightforward parallel programming
Parallel Computing
TACHYON: tandem execution for efficient live patch testing
Security'12 Proceedings of the 21st USENIX conference on Security symposium
LEAN: simplifying concurrency bug reproduction via replay-supported execution reduction
Proceedings of the ACM international conference on Object oriented programming systems languages and applications
Deterministic Replay Using Global Clock
ACM Transactions on Architecture and Code Optimization (TACO)
Cyrus: unintrusive application-level record-replay for replay parallelism
Proceedings of the eighteenth international conference on Architectural support for programming languages and operating systems
Proceedings of the 40th Annual International Symposium on Computer Architecture
Concurrent predicates: a debugging technique for every parallel programmer
PACT '13 Proceedings of the 22nd international conference on Parallel architectures and compilation techniques
Semi-automated debugging via binary search through a process lifetime
Proceedings of the Seventh Workshop on Programming Languages and Operating Systems
RelaxReplay: record and replay for relaxed-consistency multiprocessors
Proceedings of the 19th international conference on Architectural support for programming languages and operating systems
Hi-index | 0.00 |
While deterministic replay of parallel programs is a powerful technique, current proposals have shortcomings. Specifically, software-based replay systems have high overheads on multiprocessors, while hardware-based proposals focus only on basic hardware-level mechanisms, ignoring the overall replay system. To be practical, hardware-based replay systems need to support an environment with multiple parallel jobs running concurrently -- some being recorded, others being replayed and even others running without recording or replay. Moreover, they need to manage limited-size log buffers. This paper addresses these shortcomings by introducing, for the first time, a set of abstractions and a software-hardware interface for practical hardware-assisted replay of multiprocessor systems. The approach, called Capo, introduces the novel abstraction of the Replay Sphere to separate the responsibilities of the hardware and software components of the replay system. In this paper, we also design and build CapoOne, a prototype of a deterministic multiprocessor replay system that implements Capo using Linux and simulated DeLorean hardware. Our evaluation of 4-processor executions shows that CapoOne largely records with the efficiency of hardware-based schemes and the flexibility of software-based schemes.