Replay is an important technique in program analysis: it makes it possible to reproduce bugs, track changes, and repeat executions to better understand their results. Unfortunately, because re-executing a concurrent program does not necessarily produce the same ordering of events, replaying such programs is difficult. The most common approach to replaying concurrent programs analyzes the logical dependencies among concurrent events; it requires a complete recording of the execution to be replayed as well as complete control over the program's scheduler. In realistic settings, we usually have only a partial recording of the execution and only partial control over scheduling decisions, so such an analysis is often impossible. In this paper, we present an approach to replay in the presence of partial information and partial control. Our approach is based on a novel application of the cross-entropy method and does not require any logical analysis of dependencies among concurrent events. Roughly speaking, given a partial recording R of an execution, we define a performance function on executions that reaches its maximum on R (or on any other execution that coincides with R on the recorded events). The program is then executed many times in iterations, and in each iteration the probabilistic scheduling decisions are adjusted so as to drive the performance function toward its maximum. Our method is also applicable to debugging concurrent programs, where the program is changed before it is replayed in order to obtain more information from its execution. We implemented our replay method for concurrent Java programs, and we show that it consistently achieves a close replay in the presence of incomplete information and incomplete control, as well as when the program is changed before it is replayed.
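The iterative scheme described above can be illustrated with a minimal sketch. This is a toy model under stated assumptions, not the paper's implementation: an "execution" is reduced to a list of thread choices (0 or 1) per scheduling step, the partial recording R is a hypothetical dictionary mapping some step indices to the thread observed there, and the performance function is simply the fraction of recorded steps that match. The names `performance`, `cross_entropy_replay`, and all parameter values are illustrative.

```python
import random


def performance(schedule, recording):
    """Fraction of recorded steps at which the schedule agrees with the recording.

    Reaches its maximum (1.0) on any schedule that coincides with the
    recording on the recorded steps; unrecorded steps are ignored.
    """
    hits = sum(1 for step, thread in recording.items() if schedule[step] == thread)
    return hits / len(recording)


def cross_entropy_replay(recording, n_steps, samples=200, elite_frac=0.1,
                         smoothing=0.7, iterations=30, seed=0):
    """Toy cross-entropy search for a schedule that matches a partial recording.

    Assumption: sampling a list of thread choices stands in for actually
    running the program under a probabilistic scheduler.
    """
    rng = random.Random(seed)
    probs = [0.5] * n_steps          # probs[t] = P(thread 1 runs at step t)
    n_elite = max(1, int(samples * elite_frac))
    best = None

    for _ in range(iterations):
        # Sample a batch of executions from the current scheduling distribution.
        batch = [[1 if rng.random() < p else 0 for p in probs]
                 for _ in range(samples)]
        # Keep the executions that score highest on the performance function.
        batch.sort(key=lambda s: performance(s, recording), reverse=True)
        elite = batch[:n_elite]
        if best is None or performance(elite[0], recording) > performance(best, recording):
            best = elite[0]
        # Move each scheduling probability toward the elite frequency (smoothed).
        for t in range(n_steps):
            freq = sum(s[t] for s in elite) / n_elite
            probs[t] = smoothing * freq + (1 - smoothing) * probs[t]
        if performance(best, recording) == 1.0:
            break

    return best, probs
```

Calling `cross_entropy_replay({0: 1, 3: 0, 4: 1, 7: 0}, n_steps=8)` returns the best schedule found together with the final scheduling probabilities. In a real setting each sample would be an actual run of the program, and the scheduler could only bias, not dictate, the interleaving.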