Cyclic debugging refers to error detection techniques in which a program is executed repeatedly to identify the original cause of incorrect runtime behavior. This repetition is especially problematic for large-scale, long-running parallel programs because of the time and processing resources it requires and the associated computing costs. A solution is offered by a combination of techniques that use the event graph model as the main representation of parallel program behavior. On the one hand, process isolation reduces the number of deployed processes: only a subset of the original processes is executed during debugging. On the other hand, an integrated checkpointing mechanism makes it possible to extract limited periods of execution time or to start subsequent program executions at intermediate points. Additionally, the event graph supports equivalent re-execution of nondeterministic programs and makes it possible to investigate the effects of program perturbation induced by the observation functionality.
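
As a rough illustration of how an event graph can support process isolation, the following Python sketch records events per process and the messages between them, then lists the messages that cross into an isolated subset of processes. During an isolated re-execution, these are the receives that would have to be supplied from recorded data, because their senders are not running. All names here (Event, EventGraph, boundary_messages) are hypothetical and chosen for illustration; this is a minimal sketch under those assumptions, not the authors' implementation.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Event:
    process: int   # id of the process on which the event occurred
    index: int     # position in that process's local event order
    kind: str      # "send", "recv", or "local"


@dataclass
class EventGraph:
    # Hypothetical minimal event graph: per-process event lists plus
    # message edges connecting a send event to its matching receive.
    events: dict = field(default_factory=dict)    # process id -> list of Event
    messages: list = field(default_factory=list)  # list of (send, recv) pairs

    def add_event(self, process, kind="local"):
        seq = self.events.setdefault(process, [])
        ev = Event(process, len(seq), kind)
        seq.append(ev)
        return ev

    def add_message(self, sender, receiver):
        send = self.add_event(sender, "send")
        recv = self.add_event(receiver, "recv")
        self.messages.append((send, recv))
        return send, recv

    def boundary_messages(self, isolated):
        # Messages received inside the isolated set whose sender is outside it.
        return [
            (s, r)
            for s, r in self.messages
            if r.process in isolated and s.process not in isolated
        ]


# Example: three processes, of which only 0 and 1 are re-executed.
g = EventGraph()
g.add_message(0, 1)
g.add_message(2, 1)
g.add_message(1, 0)
boundary = g.boundary_messages({0, 1})
```

In this example only the message from process 2 to process 1 crosses the isolation boundary, so only that receive would need to be replayed from the trace; communication between processes 0 and 1 is regenerated by the re-execution itself.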