Many parallel programs use message passing for communication. Message passing yields efficient and portable programs, but their complexity makes them hard to debug. Deadlocks are a common problem in such programs and are difficult to detect. MPIDD is a deadlock detector that dynamically detects deadlocks in parallel programs written in C++ with MPI. Detection code has been implemented for most of the blocking and non-blocking point-to-point and collective routines. The detector has been tested against an extensive test suite, application programs, and some publicly available benchmarks. It uses MPI's profiling layer, requires no significant modification of the user's code, and incurs very little overhead when invoked. The detector code is also portable, which is a key advantage.
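The abstract does not describe MPIDD's detection algorithm, so the following is only a minimal sketch of one common way a dynamic detector can recognize a deadlock: a cycle check over a wait-for relation between ranks. The function name `find_deadlock_cycle` and the `waits_for` vector are hypothetical, and the model assumes each rank is blocked on at most one peer, which ignores wildcard receives and collectives. In a real tool, the wait-for information would be gathered by wrappers placed in MPI's profiling layer; here it is passed in directly so the example runs without an MPI installation.

```cpp
#include <cassert>
#include <vector>

// Hypothetical model, not MPIDD's actual data structure:
// waits_for[r] is the rank that rank r is currently blocked on
// (for example, the peer of a pending blocking receive),
// or -1 if rank r is not blocked.
// Returns the ranks that form a wait-for cycle, or an empty vector
// if no cycle exists.
std::vector<int> find_deadlock_cycle(const std::vector<int>& waits_for) {
    const int n = static_cast<int>(waits_for.size());

    // Every entry must be -1 or a valid rank.
    for (int w : waits_for) {
        assert(w >= -1 && w < n);
    }

    // 0 = unvisited, 1 = on the current walk, 2 = finished, no cycle found
    std::vector<int> state(n, 0);

    for (int start = 0; start < n; ++start) {
        if (state[start] != 0) continue;

        // Follow the chain of blocked ranks beginning at `start`.
        std::vector<int> path;
        int cur = start;
        while (cur != -1 && state[cur] == 0) {
            state[cur] = 1;
            path.push_back(cur);
            cur = waits_for[cur];
        }

        if (cur != -1 && state[cur] == 1) {
            // The walk returned to a rank on the current path:
            // the cycle is the part of the path starting at that rank.
            std::vector<int> cycle;
            bool in_cycle = false;
            for (int r : path) {
                if (r == cur) in_cycle = true;
                if (in_cycle) cycle.push_back(r);
            }
            return cycle;
        }

        // The chain ended at an unblocked rank or at a finished one.
        for (int r : path) state[r] = 2;
    }
    return {};
}
```

For instance, if rank 0 is blocked receiving from rank 1 while rank 1 is blocked receiving from rank 0, calling `find_deadlock_cycle({1, 0})` reports both ranks, whereas a chain that ends at an unblocked rank, such as `{1, 2, -1}`, reports none.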