A Formal Model of Crash Recovery in a Distributed System

  • Authors:
  • D. Skeen; M. Stonebraker

  • Affiliations:
  • Department of Computer Science, Cornell University;-

  • Venue:
  • IEEE Transactions on Software Engineering
  • Year:
  • 1983

Abstract

A formal model for atomic commit protocols for a distributed database system is introduced. The model is used to prove existence results about resilient protocols for site failures that do not partition the network and then for partitioned networks. For site failures, a pessimistic recovery technique, called independent recovery, is introduced and the class of failures for which resilient protocols exist is identified. For partitioned networks, two cases are studied: the pessimistic case in which messages are lost, and the optimistic case in which no messages are lost. In all cases, fundamental limitations on the resiliency of protocols are derived.
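As a rough illustration of what independent recovery means, the sketch below models a participant in a two-phase commit protocol that must decide a transaction's outcome after a crash using only its own logged state, without contacting other sites. The state names, the `recover_independently` function, and the assumption that a yes-vote is logged before it is sent are all hypothetical choices made for this example; they are not the notation or the formal model used in the paper.

```python
from enum import Enum
from typing import Optional


class State(Enum):
    """Hypothetical local states of a two-phase commit participant."""
    INITIAL = "q"    # has not voted yet
    WAIT = "w"       # voted yes, awaiting the coordinator's decision
    ABORTED = "a"
    COMMITTED = "c"


class Decision(Enum):
    COMMIT = "commit"
    ABORT = "abort"


def recover_independently(last_logged: State) -> Optional[Decision]:
    """Decide using only the local state logged before the crash.

    Returns None when the local state alone cannot determine the outcome.
    Assumes (for this sketch) that a yes-vote is logged before it is sent.
    """
    if last_logged is State.COMMITTED:
        return Decision.COMMIT
    if last_logged is State.ABORTED:
        return Decision.ABORT
    if last_logged is State.INITIAL:
        # No yes-vote was sent, so no other site can have committed.
        return Decision.ABORT
    # WAIT: other sites may have committed or aborted, so local
    # information is insufficient to decide.
    return None


if __name__ == "__main__":
    for state in State:
        print(state.name, "->", recover_independently(state))
```

In this sketch the `WAIT` state is the one where a recovering site cannot decide on its own, which gives a concrete sense of the kind of limitation on resiliency that the abstract refers to.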