Selective Checkpointing and Rollbacks in Multithreaded Distributed Systems

  • Authors:
  • Affiliations:
  • Venue: ICDCS '01 Proceedings of the 21st International Conference on Distributed Computing Systems
  • Year: 2001

Abstract

Modern distributed systems are often multithreaded and object-oriented in design. They require efficient techniques to checkpoint and restore their state in order to improve fault tolerance. Traditional process-based distributed checkpointing and rollback algorithms suffer from false dependencies, which make them rigid and inefficient for modern systems. In this paper, we develop protocols that can selectively checkpoint (and roll back) some threads of a distributed system while leaving others untouched, yet still ensure that the state resulting from such a partial rollback is consistent.
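
To make the notion of a false dependency concrete, the following is a minimal illustrative sketch and not the protocol developed in the paper. It assumes a simplified model in which every listed message was sent and received after the most recent checkpoints, so that rolling back a sender forces its receiver to roll back too. The function name `rollback_set`, the thread and process names, and the `owner` mapping are all hypothetical. It also ignores any state that threads of the same process may share.

```python
from collections import defaultdict


def rollback_set(failed, messages, owner=None):
    """Return the units that must roll back when thread `failed` rolls back.

    messages: (sender_thread, receiver_thread) pairs, assumed (for this
              sketch) to be sent and received after the latest checkpoints.
    owner:    optional thread -> process mapping. When given, dependencies
              are tracked per process; otherwise per thread.
    """
    unit = (lambda t: owner[t]) if owner else (lambda t: t)

    # A receiver depends on its sender: undoing the send orphans the message.
    dependents = defaultdict(set)
    for sender, receiver in messages:
        if unit(sender) != unit(receiver):
            dependents[unit(sender)].add(unit(receiver))

    # Transitive closure of the dependency relation, starting at the failure.
    to_visit = [unit(failed)]
    affected = set(to_visit)
    while to_visit:
        current = to_visit.pop()
        for nxt in dependents[current]:
            if nxt not in affected:
                affected.add(nxt)
                to_visit.append(nxt)
    return affected


# Hypothetical system: three processes with one or two threads each.
owner = {"a1": "P1", "b1": "P2", "b2": "P2", "c1": "P3"}
messages = [("a1", "b1"), ("b2", "c1")]

per_thread = rollback_set("a1", messages)
per_process = rollback_set("a1", messages, owner)
print(sorted(per_thread))
print(sorted(per_process))
```

In this example thread c1 never received anything that causally depends on a1, yet process-granularity tracking pulls P3 into the rollback because b1 and b2 share a process. Tracking at thread granularity keeps the rollback to a1 and b1, which is the kind of saving that selective checkpointing aims for.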