In this paper we report on features added to a parallel debugger to simplify the debugging of message-passing programs. These features include replay, consistent breakpoints set on the basis of interprocess event causality, a parallel undo operation, and communication supervision. All of these features rely on trace information collected during the execution of the program being debugged. We used several different instrumentation techniques to collect traces, and we implemented trace displays using two different trace visualization systems. The implementation was tested on an SGI Power Challenge cluster and a network of SGI workstations.
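To illustrate what a "consistent" breakpoint can mean in terms of event causality, the following is a minimal sketch using vector clocks, a standard technique that the abstract does not confirm the debugger uses. The function name `is_consistent_cut` and the `frontier` data layout are hypothetical and chosen only for this example. A set of per-process stopping points is treated as consistent when no process has observed an event of another process that lies beyond that other process's own stopping point.

```python
from typing import Dict, List

# Assumption: each traced event carries a vector clock, one counter per process.
VectorClock = List[int]


def is_consistent_cut(frontier: Dict[int, VectorClock]) -> bool:
    """Check whether a set of per-process stopping points is causally consistent.

    `frontier` maps a process rank to the vector clock of the last event that
    process executed before stopping. The cut is inconsistent if some process j
    has seen more events of process i than process i has executed at its own
    stopping point (for example, a receive without the matching send).
    """
    for i, clock_i in frontier.items():
        for clock_j in frontier.values():
            if clock_j[i] > clock_i[i]:
                return False
    return True


# Two processes: P0 sends a message (clock [1, 0]); P1 receives it (clock [1, 1]).
stop_before_send = {0: [0, 0], 1: [1, 1]}   # P1 received a message P0 has not sent
stop_after_send = {0: [1, 0], 1: [1, 1]}    # send and receive both inside the cut
message_in_flight = {0: [1, 0], 1: [0, 0]}  # sent but not yet received

results = (
    is_consistent_cut(stop_before_send),
    is_consistent_cut(stop_after_send),
    is_consistent_cut(message_in_flight),
)
```

In this sketch a message that is still in flight leaves the cut consistent, whereas a receive whose send lies outside the cut does not; a debugger could use a check of this kind to reject or adjust a requested breakpoint set.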