The paper defines replay of a distributed application as a function of three parameters: depth, width, and length. It addresses the problem of nondeterminism in distributed systems and proposes an efficient approach to tracing the behaviour of a PVM application in order to eliminate races in repeated executions. Detecting races in distributed computations requires a strongly consistent system of vector clocks, so a system of vector clocks was adapted for a dynamic application model. Finally, the paper presents the architecture of a tool supporting replay of PVM applications.
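
To make the role of vector clocks concrete, the following is a minimal Python sketch, not the paper's implementation. It assumes that clocks are stored as dictionaries keyed by task identifier, so that a task created at run time simply appears as a new key (one possible reading of a "dynamic application model"). The names `tick`, `merge`, `happened_before`, and `concurrent`, and the task identifiers `t1` to `t4`, are hypothetical. A clock system is strongly consistent when event a happened before event b exactly when the clock of a is strictly smaller than the clock of b; two sends to the same receiver whose clocks are incomparable can then be flagged as a potential race.

```python
from typing import Dict

# Assumption: a clock maps task id -> logical counter; a missing id counts as 0,
# so tasks spawned at run time need no pre-allocated slot.
Clock = Dict[str, int]


def tick(clock: Clock, task: str) -> Clock:
    """Return a copy of `clock` with `task`'s own component advanced by one."""
    new = dict(clock)
    new[task] = new.get(task, 0) + 1
    return new


def merge(local: Clock, received: Clock, task: str) -> Clock:
    """Receive rule: component-wise maximum, then advance the receiver's entry."""
    keys = set(local) | set(received)
    joined = {k: max(local.get(k, 0), received.get(k, 0)) for k in keys}
    return tick(joined, task)


def leq(a: Clock, b: Clock) -> bool:
    """True if every component of `a` is <= the matching component of `b`."""
    return all(a.get(k, 0) <= b.get(k, 0) for k in set(a) | set(b))


def happened_before(a: Clock, b: Clock) -> bool:
    """Strict ordering: a <= b component-wise, and b is not <= a."""
    return leq(a, b) and not leq(b, a)


def concurrent(a: Clock, b: Clock) -> bool:
    """Neither clock dominates the other: the two events are causally unordered."""
    return not leq(a, b) and not leq(b, a)


# Two independent tasks each send a message to t3.
send1 = tick({}, "t1")
send2 = tick({}, "t2")
potential_race = concurrent(send1, send2)  # unordered sends to one receiver

# t3 receives send1; a task t4 created afterwards starts from t3's clock.
recv = merge({}, send1, "t3")
send4 = tick(recv, "t4")
ordered = happened_before(send1, send4)  # send1 causally precedes send4
```

In this sketch `send1` and `send2` are incomparable, so a wildcard receive in `t3` could accept either one first, which is the kind of nondeterminism a replay tool would need to record. The send from `t4` is ordered after `send1` because `t4` inherited the clock of the task that had already received it.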