The SPLASH-2 programs: characterization and methodological considerations
ISCA '95 Proceedings of the 22nd annual international symposium on Computer architecture
RecPlay: a fully integrated practical record/replay system
ACM Transactions on Computer Systems (TOCS)
Time, clocks, and the ordering of events in a distributed system
Communications of the ACM
Symbolic execution and program testing
Communications of the ACM
Pin: building customized program analysis tools with dynamic instrumentation
Proceedings of the 2005 ACM SIGPLAN conference on Programming language design and implementation
TraceBack: first fault diagnosis by reconstruction of distributed control flow
Proceedings of the 2005 ACM SIGPLAN conference on Programming language design and implementation
Jockey: a user-space library for record-replay debugging
Proceedings of the sixth international symposium on Automated analysis-driven debugging
EXE: automatically generating inputs of death
Proceedings of the 13th ACM conference on Computer and communications security
Computer Architecture, Fourth Edition: A Quantitative Approach
Computer Architecture, Fourth Edition: A Quantitative Approach
Framework for instruction-level tracing and analysis of program executions
Proceedings of the 2nd international conference on Virtual execution environments
Debugging operating systems with time-traveling virtual machines
ATEC '05 Proceedings of the annual conference on USENIX Annual Technical Conference
Valgrind: a framework for heavyweight dynamic binary instrumentation
Proceedings of the 2007 ACM SIGPLAN conference on Programming language design and implementation
Replay debugging for distributed applications
ATEC '06 Proceedings of the annual conference on USENIX '06 Annual Technical Conference
Execution replay of multiprocessor virtual machines
Proceedings of the fourth ACM SIGPLAN/SIGOPS international conference on Virtual execution environments
Relaxed determinism: making redundant execution on multiprocessors practical
HOTOS'07 Proceedings of the 11th USENIX workshop on Hot topics in operating systems
Rerun: Exploiting Episodes for Lightweight Memory Race Recording
ISCA '08 Proceedings of the 35th Annual International Symposium on Computer Architecture
ISCA '08 Proceedings of the 35th Annual International Symposium on Computer Architecture
Capo: a software-hardware interface for practical deterministic multiprocessor replay
Proceedings of the 14th international conference on Architectural support for programming languages and operating systems
ODR: output-deterministic replay for multicore debugging
Proceedings of the ACM SIGOPS 22nd symposium on Operating systems principles
A decision procedure for bit-vectors and arrays
CAV'07 Proceedings of the 19th international conference on Computer aided verification
R2: an application-level kernel for record and replay
OSDI'08 Proceedings of the 8th USENIX conference on Operating systems design and implementation
ODR: output-deterministic replay for multicore debugging
Proceedings of the ACM SIGOPS 22nd symposium on Operating systems principles
Respec: efficient online multiprocessor replayvia speculation and external determinism
Proceedings of the fifteenth edition of ASPLOS on Architectural support for programming languages and operating systems
Analyzing multicore dumps to facilitate concurrency bug reproduction
Proceedings of the fifteenth edition of ASPLOS on Architectural support for programming languages and operating systems
Execution synthesis: a technique for automated software debugging
Proceedings of the 5th European conference on Computer systems
LReplay: a pending period based deterministic replay scheme
Proceedings of the 37th annual international symposium on Computer architecture
A trace simplification technique for effective debugging of concurrent programs
Proceedings of the eighteenth ACM SIGSOFT international symposium on Foundations of software engineering
LEAP: lightweight deterministic multi-processor replay of concurrent java programs
Proceedings of the eighteenth ACM SIGSOFT international symposium on Foundations of software engineering
Dynamic verification of hybrid programs
EuroMPI'10 Proceedings of the 17th European MPI users' group meeting conference on Recent advances in the message passing interface
Focus replay debugging effort on the control plane
HotDep'10 Proceedings of the Sixth international conference on Hot topics in system dependability
Bypassing races in live applications with execution filters
OSDI'10 Proceedings of the 9th USENIX conference on Operating systems design and implementation
Deterministic process groups in dOS
OSDI'10 Proceedings of the 9th USENIX conference on Operating systems design and implementation
Stable deterministic multithreading through schedule memoization
OSDI'10 Proceedings of the 9th USENIX conference on Operating systems design and implementation
Automating configuration troubleshooting with dynamic information flow analysis
OSDI'10 Proceedings of the 9th USENIX conference on Operating systems design and implementation
InstantCheck: Checking the Determinism of Parallel Programs Using On-the-Fly Incremental Hashing
MICRO '43 Proceedings of the 2010 43rd Annual IEEE/ACM International Symposium on Microarchitecture
Low-overhead bug fingerprinting for fast debugging
RV'10 Proceedings of the First international conference on Runtime verification
DoublePlay: parallelizing sequential logging and replay
Proceedings of the sixteenth international conference on Architectural support for programming languages and operating systems
RCDC: a relaxed consistency deterministic computer
Proceedings of the sixteenth international conference on Architectural support for programming languages and operating systems
ConSeq: detecting concurrency bugs through sequential errors
Proceedings of the sixteenth international conference on Architectural support for programming languages and operating systems
Proceedings of the sixth conference on Computer systems
Striking a new balance between program instrumentation and debugging time
Proceedings of the sixth conference on Computer systems
Dependence-based multi-level tracing and replay for wireless sensor networks debugging
Proceedings of the 2011 SIGPLAN/SIGBED conference on Languages, compilers and tools for embedded systems
Debug determinism: the sweet spot for replay-based debugging
HotOS'13 Proceedings of the 13th USENIX conference on Hot topics in operating systems
Making programs forget: enforcing lifetime for sensitive data
HotOS'13 Proceedings of the 13th USENIX conference on Hot topics in operating systems
Toward generating reducible replay logs
Proceedings of the 32nd ACM SIGPLAN conference on Programming language design and implementation
Record and transplay: partial checkpointing for replay debugging across heterogeneous systems
Proceedings of the ACM SIGMETRICS joint international conference on Measurement and modeling of computer systems
Karma: scalable deterministic record-replay
Proceedings of the international conference on Supercomputing
RADBench: a concurrency bug benchmark suite
HotPar'11 Proceedings of the 3rd USENIX conference on Hot topic in parallelism
OFRewind: enabling record and replay troubleshooting for networks
USENIXATC'11 Proceedings of the 2011 USENIX conference on USENIX annual technical conference
ORDER: object centric deterministic replay for Java
USENIXATC'11 Proceedings of the 2011 USENIX conference on USENIX annual technical conference
Record and transplay: partial checkpointing for replay debugging across heterogeneous systems
ACM SIGMETRICS Performance Evaluation Review - Performance evaluation review
Partial replay of long-running applications
Proceedings of the 19th ACM SIGSOFT symposium and the 13th European conference on Foundations of software engineering
SOSP '11 Proceedings of the Twenty-Third ACM Symposium on Operating Systems Principles
Efficient deterministic multithreading through schedule relaxation
SOSP '11 Proceedings of the Twenty-Third ACM Symposium on Operating Systems Principles
Advances and challenges in log analysis
Communications of the ACM
Advances and Challenges in Log Analysis
Queue - Log Analysis
DoublePlay: Parallelizing Sequential Logging and Replay
ACM Transactions on Computer Systems (TOCS) - Special Issue APLOS 2011
SecondSite: disaster tolerance as a service
VEE '12 Proceedings of the 8th ACM SIGPLAN/SIGOPS conference on Virtual Execution Environments
CoreRacer: a practical memory race recorder for multicore x86 TSO processors
Proceedings of the 44th Annual IEEE/ACM International Symposium on Microarchitecture
Exploiting parallelism in deterministic shared memory multiprocessing
Journal of Parallel and Distributed Computing
Can deterministic replay be an enabling tool for mobile computing?
Proceedings of the 12th Workshop on Mobile Computing Systems and Applications
CDE: run any Linux application on-demand without installation
LISA'11 Proceedings of the 25th international conference on Large Installation System Administration
Chimera: hybrid program analysis for determinism
Proceedings of the 33rd ACM SIGPLAN conference on Programming Language Design and Implementation
Stride: search-based deterministic replay in polynomial time via bounded linkage
Proceedings of the 34th International Conference on Software Engineering
ShadowStream: performance evaluation as a capability in production internet live streaming networks
Proceedings of the ACM SIGCOMM 2012 conference on Applications, technologies, architectures, and protocols for computer communication
Tracing and recording interrupts in embedded software
Journal of Systems Architecture: the EUROMICRO Journal
ShadowStream: performance evaluation as a capability in production internet live streaming networks
ACM SIGCOMM Computer Communication Review - Special october issue SIGCOMM '12
LEAN: simplifying concurrency bug reproduction via replay-supported execution reduction
Proceedings of the ACM international conference on Object oriented programming systems languages and applications
Automated concurrency-bug fixing
OSDI'12 Proceedings of the 10th USENIX conference on Operating Systems Design and Implementation
All about Eve: execute-verify replication for multi-core servers
OSDI'12 Proceedings of the 10th USENIX conference on Operating Systems Design and Implementation
Be conservative: enhancing failure diagnosis with proactive logging
OSDI'12 Proceedings of the 10th USENIX conference on Operating Systems Design and Implementation
X-ray: automating root-cause diagnosis of performance anomalies in production software
OSDI'12 Proceedings of the 10th USENIX conference on Operating Systems Design and Implementation
DTAM: dynamic taint analysis of multi-threaded programs for relevancy
Proceedings of the ACM SIGSOFT 20th International Symposium on the Foundations of Software Engineering
RemusDB: transparent high availability for database systems
The VLDB Journal — The International Journal on Very Large Data Bases
Scalable deterministic replay in a parallel full-system emulator
Proceedings of the 18th ACM SIGPLAN symposium on Principles and practice of parallel programming
RaceFree: an efficient multi-threading model for determinism
Proceedings of the 18th ACM SIGPLAN symposium on Principles and practice of parallel programming
Deterministic Replay Using Global Clock
ACM Transactions on Architecture and Code Optimization (TACO)
ConAir: featherweight concurrency bug recovery via single-threaded idempotent execution
Proceedings of the eighteenth international conference on Architectural support for programming languages and operating systems
Transparent mutable replay for multicore debugging and patch validation
Proceedings of the eighteenth international conference on Architectural support for programming languages and operating systems
Cyrus: unintrusive application-level record-replay for replay parallelism
Proceedings of the eighteenth international conference on Architectural support for programming languages and operating systems
Verifying systems rules using rule-directed symbolic execution
Proceedings of the eighteenth international conference on Architectural support for programming languages and operating systems
CONCURRIT: a domain specific language for reproducing concurrency bugs
Proceedings of the 34th ACM SIGPLAN conference on Programming language design and implementation
CLAP: recording local executions to reproduce concurrency failures
Proceedings of the 34th ACM SIGPLAN conference on Programming language design and implementation
Automated debugging for arbitrarily long executions
HotOS'13 Proceedings of the 14th USENIX conference on Hot Topics in Operating Systems
Value-deterministic search-based replay for android multithreaded applications
Proceedings of the 2013 Research in Adaptive and Convergent Systems
COLO: COarse-grained LOck-stepping virtual machines for non-stop service
Proceedings of the 4th annual Symposium on Cloud Computing
Towards effective and efficient search-based deterministic replay
Proceedings of the 9th Workshop on Hot Topics in Dependable Systems
Semi-automated debugging via binary search through a process lifetime
Proceedings of the Seventh Workshop on Programming Languages and Operating Systems
Leveraging the short-term memory of hardware to diagnose production-run software failures
Proceedings of the 19th international conference on Architectural support for programming languages and operating systems
Global property violation detection and diagnosis for wireless sensor networks
Proceedings of the 2013 International Conference on Compilers, Architectures and Synthesis for Embedded Systems
X10-FT: Transparent fault tolerance for APGAS language and runtime
Parallel Computing
Hi-index | 0.03 |
Reproducing bugs is hard. Deterministic replay systems address this problem by providing a high-fidelity replica of an original program run that can be repeatedly executed to zero-in on bugs. Unfortunately, existing replay systems for multiprocessor programs fall short. These systems either incur high overheads, rely on non-standard multiprocessor hardware, or fail to reliably reproduce executions. Their primary stumbling block is data races -- a source of nondeterminism that must be captured if executions are to be faithfully reproduced. In this paper, we present ODR--a software-only replay system that reproduces bugs and provides low-overhead multiprocessor recording. The key observation behind ODR is that, for debugging purposes, a replay system does not need to generate a high-fidelity replica of the original execution. Instead, it suffices to produce any execution that exhibits the same outputs as the original. Guided by this observation, ODR relaxes its fidelity guarantees to avoid the problem of reproducing data-races altogether. The result is a system that replays real multiprocessor applications, such as Apache, MySQL, and the Java Virtual Machine, and provides low record-mode overhead.