Building powerful machines is only one part of high-performance computing. Such supercomputers must also be programmed efficiently to obtain the desired performance. This task is difficult and time consuming due to the huge amounts of data being processed and critical anomalies such as deadlocks and race conditions. This paper focuses on race conditions in shared-memory programs, which are introduced by nondeterministic behavior at synchronization or communication operations. Such programs may yield different results even if the same input data is provided. This complicates testing and debugging, where techniques for re-executing and controlling the nondeterminism of such programs are needed. One such sophisticated technique is event manipulation, which makes it possible to steer race conditions in parallel programs. While originally applied to message-passing programs, the latest event manipulation approach addresses OpenMP shared-memory programs. This paper describes the principal idea of shared-memory event manipulation and demonstrates its application on a simple mutual exclusion example.