Debugging is a crucial part of the software development process. Massively parallel programs in particular pose great difficulties for program analysis and debugging because they are more complex than sequential programs. Several tools are available for debugging and analysing parallel programs, but many of them fail for massively parallel programs with potentially thousands of processes. In this work we introduce the single process debugging strategy, a scalable debugging strategy for massively parallel programs. The goal of this strategy is to make debugging large-scale programs as simple and straightforward as debugging sequential programs. This is achieved by adapting and combining several techniques that are well known from sequential debugging. In combination, these techniques let the user execute and investigate small fractions of a possibly huge parallel program without having to (re-)execute the entire program.