Debugging Parallel Programs with Instant Replay
IEEE Transactions on Computers
ACM Computing Surveys (CSUR)
Time, clocks, and the ordering of events in a distributed system
Communications of the ACM
Cyclic debugging is one of the most important and most commonly used activities in program development. During cyclic debugging, a program is re-executed repeatedly to track down the error behind an observed failure. This approach often fails for parallel programs, because message races make their execution nondeterministic: a rerun with the same input may not reproduce the failure. Execution replay is a technique developed to facilitate the debugging of such nondeterministic programs. The order of events in the original execution is recorded in a trace file, which can then be used to force a replay of the parallel program with the same input. The size of the trace file is a key measure of the scalability of a record-and-replay scheme. This paper proposes an improved clock system, called the 1-n clock. By combining a local logical clock with a vector clock, the 1-n clock reduces the size of the trace file. The method is especially suited to record and replay of long-running parallel programs.
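The abstract does not specify how the 1-n clock combines its two ingredients, so the following is only a minimal sketch of those ingredients in their textbook form: a scalar (Lamport) logical clock and a vector clock. The class and function names are illustrative assumptions, not taken from the paper. The sketch shows why timestamp size matters for a trace: the scalar clock costs one integer per event, while the vector clock costs one integer per process but can detect a message race.

```python
from dataclasses import dataclass, field


@dataclass
class LamportClock:
    """Scalar logical clock: one integer per timestamp (illustrative name)."""
    time: int = 0

    def send(self) -> int:
        # A send is a local event: advance the clock and stamp the message.
        self.time += 1
        return self.time

    def receive(self, msg_time: int) -> int:
        # Jump past the sender's timestamp so the receive is ordered after it.
        self.time = max(self.time, msg_time) + 1
        return self.time


@dataclass
class VectorClock:
    """Vector clock: n integers per timestamp (illustrative name)."""
    pid: int
    n: int
    v: list = field(default_factory=list)

    def __post_init__(self):
        if not self.v:
            self.v = [0] * self.n

    def send(self) -> list:
        self.v[self.pid] += 1
        return list(self.v)  # copy, so the message stamp is immutable

    def receive(self, msg_v: list) -> list:
        # Component-wise maximum, then count the receive as a local event.
        self.v = [max(a, b) for a, b in zip(self.v, msg_v)]
        self.v[self.pid] += 1
        return list(self.v)


def happened_before(a: list, b: list) -> bool:
    """True if vector timestamp a causally precedes b."""
    return a != b and all(x <= y for x, y in zip(a, b))


def concurrent(a: list, b: list) -> bool:
    """True if neither timestamp precedes the other (a potential race)."""
    return a != b and not happened_before(a, b) and not happened_before(b, a)
```

As a usage example, let processes 0 and 1 each send one message to process 2. With vector clocks the two sends carry the stamps `[1, 0, 0]` and `[0, 1, 0]`, which `concurrent` reports as unordered: this is exactly the kind of message race whose outcome a replay scheme has to record. With scalar clocks both sends carry the stamp `1`, which is cheaper to log but cannot by itself tell a race from an ordered pair.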