In message-passing parallel applications, messages are not necessarily delivered in a fixed order. The number of messages, their content, and their destination may depend on the order in which they are delivered. Nevertheless, for most applications, the computation results should be the same for all possible orderings. Finding an ordering that produces a different outcome, or that prevents the execution from terminating, reveals a message race or a deadlock. Starting from the initial application state, we dynamically build an acyclic message-passing state graph in which each path represents one possible message ordering. All paths lead to the same final state if no deadlock or message race exists. If multiple final states are reached, we report the message orderings that produce the different outcomes. The corresponding executions may then be replayed for debugging purposes. We reduce the number of states to be explored by using previously acquired knowledge about communication patterns and about how operations read and modify local process variables. We also describe a heuristic that tests only a subset of orderings that are likely to reveal existing message races or deadlocks. We applied our approach to several applications developed using the Dynamic Parallel Schedules (DPS) parallelization framework. Compared to the naive execution of all message orderings, the use of a message-passing state graph reduces the cost of testing all orderings by several orders of magnitude. The use of prior information further reduces the number of visited states by a factor of up to fifty in our tests. The heuristic relying on a subset of orderings revealed the race conditions in all tested cases. Finally, we present a first step toward generalizing the approach to MPI applications.
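The state-graph idea can be illustrated with a minimal Python sketch. It is a toy model written for this summary, not the DPS-based implementation: the names `explore`, `add_handler`, and `overwrite_handler` are hypothetical, a state is assumed to be the tuple of process-local values plus the multiset of in-flight messages, and any state with no in-flight messages is treated as final. Deadlock detection, the prior-knowledge reduction, and the subset heuristic are all omitted. Merging identical states reached through different delivery orders is what turns the tree of orderings into a graph.

```python
def explore(initial_locals, initial_msgs, handler):
    """Explore every delivery order of the in-flight messages.

    A state is (locals, msgs): a tuple of per-process local values and a
    sorted tuple of (dest, payload) messages. Identical states reached via
    different orderings are visited once. Returns the set of final local
    states and the number of distinct states visited.
    Assumption: payloads and local values are hashable and sortable, and
    handlers eventually stop sending messages, so the graph is finite.
    """
    start = (tuple(initial_locals), tuple(sorted(initial_msgs)))
    visited = {start}
    stack = [start]
    finals = set()
    while stack:
        locals_, msgs = stack.pop()
        if not msgs:
            # Nothing left to deliver: this is a final state.
            finals.add(locals_)
            continue
        # Delivering either of two identical messages gives the same
        # successor, so iterate over distinct messages only.
        for msg in set(msgs):
            rest = list(msgs)
            rest.remove(msg)  # remove one occurrence of the message
            dest, payload = msg
            new_local, sent = handler(dest, locals_[dest], payload)
            new_locals = locals_[:dest] + (new_local,) + locals_[dest + 1:]
            nxt = (new_locals, tuple(sorted(rest + list(sent))))
            if nxt not in visited:
                visited.add(nxt)
                stack.append(nxt)
    return finals, len(visited)


def add_handler(dest, local, payload):
    """Order-independent handler: accumulate the payload, send nothing."""
    return local + payload, []


def overwrite_handler(dest, local, payload):
    """Order-dependent handler: the last delivered message wins."""
    return payload, []


if __name__ == "__main__":
    msgs = [(0, 1), (0, 2), (0, 3)]  # three messages for process 0
    for handler in (add_handler, overwrite_handler):
        finals, n_states = explore([0], msgs, handler)
        verdict = "message race" if len(finals) > 1 else "no race found"
        print(handler.__name__, sorted(finals), n_states, verdict)
```

In this sketch, a single final state means every ordering agrees, while several final states indicate a message race. Recording the path that led to each final state, which would be needed for replay, is left out to keep the example short.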