On Properties of RDT Communication-Induced Checkpointing Protocols
IEEE Transactions on Parallel and Distributed Systems
Using Consistent Global Checkpoints to Synchronize Processes in Distributed Simulation
DS-RT '05 Proceedings of the 9th IEEE International Symposium on Distributed Simulation and Real-Time Applications
A checkpointing protocol that enforces rollback-dependency trackability (RDT) during the progress of a distributed computation must induce processes to take forced checkpoints to avoid the formation of non-trackable rollback dependencies. A protocol based on the minimal characterization of RDT tests only the smallest set of non-trackable dependencies. The literature indicated that this approach would require the processes to maintain and propagate O(n^2) control information, where n is the number of processes in the computation. In this paper, we present a protocol that implements this approach using only O(n) control information.
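To make the O(n) figure concrete, the sketch below shows the kind of control information an RDT protocol typically piggybacks on each message: a dependency vector with one checkpoint-interval index per process. It is not the protocol presented in this paper. For illustration it uses the simpler, well-known Fixed-Dependency-After-Send (FDAS) rule, which forces a checkpoint whenever a message would change the dependency vector after the process has already sent in the current interval. The class name `Process`, its methods, and the three-process scenario are all assumptions made for the example.

```python
class Process:
    """One process in an n-process computation (illustrative FDAS sketch)."""

    def __init__(self, pid, n):
        self.pid = pid
        self.dv = [0] * n        # dependency vector: O(n) control information
        self.sent = False        # has this process sent in the current interval?
        self.checkpoints = []    # dependency vector saved with each checkpoint
        self.forced = 0          # number of forced checkpoints taken
        self.take_checkpoint()   # initial checkpoint

    def take_checkpoint(self, forced=False):
        # Save the current vector with the checkpoint, then open a new interval.
        self.checkpoints.append(list(self.dv))
        self.dv[self.pid] += 1
        self.sent = False
        if forced:
            self.forced += 1

    def send(self):
        # Piggyback a copy of the dependency vector on the outgoing message.
        self.sent = True
        return list(self.dv)

    def receive(self, piggybacked):
        # FDAS rule: after a send, the vector must stay fixed for the rest of
        # the interval, so a message carrying new dependencies forces a
        # checkpoint before it is delivered.
        brings_new = any(theirs > mine
                         for mine, theirs in zip(self.dv, piggybacked))
        if self.sent and brings_new:
            self.take_checkpoint(forced=True)
        # Merge: component-wise maximum of the local and received vectors.
        self.dv = [max(mine, theirs)
                   for mine, theirs in zip(self.dv, piggybacked)]


# Hypothetical scenario with three processes.
p0, p1, p2 = (Process(i, 3) for i in range(3))
m1 = p1.send()       # p1 sends to p0
p0.receive(m1)       # p0 has not sent yet, so no forced checkpoint
m2 = p2.send()       # p2 sends to p1
p1.receive(m2)       # p1 already sent and m2 brings a new dependency
```

In this scenario only `p1` takes a forced checkpoint. FDAS is conservative: it can force checkpoints that a test based on the minimal characterization of RDT would avoid, which is the motivation for the approach described above.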