ATUM: a new technique for capturing address traces using microcode
ISCA '86 Proceedings of the 13th annual international symposium on Computer architecture
Debugging Parallel Programs with Instant Replay
IEEE Transactions on Computers
Cheap hardware support for software debugging and profiling
ASPLOS II Proceedings of the second international conference on Architectual support for programming languages and operating systems
The duality of memory and communication in the implementation of a multiprocessor operating system
SOSP '87 Proceedings of the eleventh ACM Symposium on Operating systems principles
IGOR: a system for program debugging via reversible execution
PADD '88 Proceedings of the 1988 ACM SIGPLAN and SIGOPS workshop on Parallel and distributed debugging
A software instruction counter
ASPLOS III Proceedings of the third international conference on Architectural support for programming languages and operating systems
Demonic memory for process histories
PLDI '89 Proceedings of the ACM SIGPLAN 1989 Conference on Programming language design and implementation
Proceedings of the 1989 ACM/IEEE conference on Supercomputing
ACM Computing Surveys (CSUR)
Real-time, concurrent checkpoint for parallel programs
PPOPP '90 Proceedings of the second ACM SIGPLAN symposium on Principles & practice of parallel programming
A bibliography of parallel debuggers, 1990 edition
ACM SIGPLAN Notices
Debuggable concurrency extensions for standard ML
PADD '91 Proceedings of the 1991 ACM/ONR workshop on Parallel and distributed debugging
Source level debugging of automatically parallelized code
PADD '91 Proceedings of the 1991 ACM/ONR workshop on Parallel and distributed debugging
Hardware-assisted replay of multiprocessor programs
PADD '91 Proceedings of the 1991 ACM/ONR workshop on Parallel and distributed debugging
Design and Evaluation of the Rollback Chip: Special Purpose Hardware for Time Warp
IEEE Transactions on Computers
Optimal tracing and replay for debugging shared-memory parallel programs
PADD '93 Proceedings of the 1993 ACM/ONR workshop on Parallel and distributed debugging
History cache: hardware support for reverse execution
ACM SIGARCH Computer Architecture News
Replay for concurrent non-deterministic shared-memory applications
PLDI '96 Proceedings of the ACM SIGPLAN 1996 conference on Programming language design and implementation
Deterministic replay of Java multithreaded applications
SPDT '98 Proceedings of the SIGMETRICS symposium on Parallel and distributed tools
Efficient algorithms for bidirectional debugging
PLDI '00 Proceedings of the ACM SIGPLAN 2000 conference on Programming language design and implementation
Wanted: an application aware checkpointing service
EW 6 Proceedings of the 6th workshop on ACM SIGOPS European workshop: Matching operating systems to application needs
A Perturbation-Free Replay Platform for Cross-Optimized Multithreaded Applications
IPDPS '01 Proceedings of the 15th International Parallel & Distributed Processing Symposium
Shortcut Replay: A Replay Technique for Debugging Long-Running Parallel Programs
ASIAN '02 Proceedings of the7th Asian Computing Science Conference on Advances in Computing Science: Internet Computing and Modeling, Grid Computing, Peer-to-Peer Computing, and Cluster
Assembly instruction level reverse execution for debugging
ACM Transactions on Software Engineering and Methodology (TOSEM)
Jockey: a user-space library for record-replay debugging
Proceedings of the sixth international symposium on Automated analysis-driven debugging
Replay compilation: improving debuggability of a just-in-time compiler
Proceedings of the 21st annual ACM SIGPLAN conference on Object-oriented programming systems, languages, and applications
Dynamic slicing long running programs through execution fast forwarding
Proceedings of the 14th ACM SIGSOFT international symposium on Foundations of software engineering
Libckpt: transparent checkpointing under Unix
TCON'95 Proceedings of the USENIX 1995 Technical Conference Proceedings
Enabling tracing Of long-running multithreaded programs via dynamic execution reduction
Proceedings of the 2007 international symposium on Software testing and analysis
Efficient checkpointing of java software using context-sensitive capture and replay
Proceedings of the the 6th joint meeting of the European software engineering conference and the ACM SIGSOFT symposium on The foundations of software engineering
Capo: a software-hardware interface for practical deterministic multiprocessor replay
Proceedings of the 14th international conference on Architectural support for programming languages and operating systems
JVM Independent Replay in Java
Electronic Notes in Theoretical Computer Science (ENTCS)
Analyzing multicore dumps to facilitate concurrency bug reproduction
Proceedings of the fifteenth edition of ASPLOS on Architectural support for programming languages and operating systems
VECPAR'02 Proceedings of the 5th international conference on High performance computing for computational science
PinPlay: a framework for deterministic replay and reproducible analysis of parallel programs
Proceedings of the 8th annual IEEE/ACM international symposium on Code generation and optimization
Robust non-intrusive record-replay with processor extraction
Proceedings of the 8th Workshop on Parallel and Distributed Systems: Testing, Analysis, and Debugging
Checkpoint/restart-enabled parallel debugging
EuroMPI'10 Proceedings of the 17th European MPI users' group meeting conference on Recent advances in the message passing interface
Recent advances in checkpoint/recovery systems
IPDPS'06 Proceedings of the 20th international conference on Parallel and distributed processing
Efficient system-enforced deterministic parallelism
OSDI'10 Proceedings of the 9th USENIX conference on Operating systems design and implementation
Proceedings of the 40th Annual International Symposium on Computer Architecture
Accelerating incremental checkpointing for extreme-scale computing
Future Generation Computer Systems
Hi-index | 0.00 |
Parallel programs are difficult to debug because they run for a, long time and two executions may yield different results. Reverse execution, is a simple and powerful concept that solves both these problems. We are designing a tool for debugging parallel programs, called Recap, that provides the illusion of reverse execution using checkpoints and event recording and playback. During normal execution, Recap logs the results of system calls and shared memory reads: as well as the times that asynchronous events (signals) occur. Recap periodically checkpoints the state of a process by forking and suspending a new process. To reverse execute to a certain point in time, Recap continues the nearest checkpoint process forward in a self-contained environment, simulating all events using the log. We are implementing Recap as part of a larger environment for parallel program development.