Adaptive Message Logging for Incremental Program Replay

Authors:
Robert H. B. Netzer; Jian Xu
Venue:
IEEE Parallel & Distributed Technology: Systems & Applications
Year:
1993

Abstract

This article presents adaptive message logging, a technique that traces dependences between messages and checkpoints and logs messages selectively, so that users can accurately and efficiently replay specific portions of parallel programs. Trace size is reduced by logging only those messages that cannot be quickly recomputed during replay. When execution is restarted from the right set of checkpoints, many of the messages needed for a specific replay can be recomputed during the replay itself.
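
The selective-logging idea in the abstract can be illustrated with a small sketch. It is not the authors' algorithm: it assumes a simplified model in which each process records sorted local checkpoint times, the cost of regenerating a message is the number of local events the sender must re-execute from its latest checkpoint at or before the send, and a message is logged only when that cost exceeds a fixed budget. The names `Message`, `recompute_cost`, `messages_to_log`, and `budget` are hypothetical, and the sketch ignores transitive dependences (re-executing the sender may itself require messages that the sender received).

```python
from bisect import bisect_right
from dataclasses import dataclass


@dataclass(frozen=True)
class Message:
    sender: str
    send_time: int  # sender's local event count at the send (assumed model)
    receiver: str


def recompute_cost(checkpoints, msg):
    """Local events the sender must re-execute to regenerate msg.

    checkpoints maps a process name to a sorted list of local checkpoint
    times. If no checkpoint precedes the send, replay starts at time 0.
    """
    times = checkpoints.get(msg.sender, [])
    i = bisect_right(times, msg.send_time) - 1
    start = times[i] if i >= 0 else 0
    return msg.send_time - start


def messages_to_log(checkpoints, messages, budget):
    """Keep only messages that are too expensive to recompute during replay."""
    return [m for m in messages if recompute_cost(checkpoints, m) > budget]


if __name__ == "__main__":
    checkpoints = {"P": [0, 10, 20], "Q": [0, 50]}
    messages = [
        Message("P", 12, "Q"),  # cost 2: cheap to recompute, not logged
        Message("Q", 45, "P"),  # cost 45: logged
        Message("P", 29, "Q"),  # cost 9: logged
    ]
    for m in messages_to_log(checkpoints, messages, budget=5):
        print(m)
```

In this toy model, a smaller budget logs more messages and shortens replay, while a larger budget shrinks the trace at the cost of more recomputation. A faithful implementation would also have to track which checkpoints and received messages each recomputation depends on.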