The Performance of Two Phase Commit Protocols in the Presence of Site Failures
Distributed and Parallel Databases
We study the problem of recovery in large-scale transaction-based distributed systems with replicated data. In large distributed systems, the cost of accessing data items may be considerably greater than in small ones because of the distances involved. It is thus important to exploit replication to reduce data-access times. Failure events are also much more frequent in large systems than in small ones. Therefore, executing costly recovery protocols, such as the ones needed to update stale, newly recovered replicas or to resolve the uncertainty of recovering replicas, must be avoided. We call these protocols dependent recovery protocols, since they require a recovering site to consult other sites before it can be reintegrated into the distributed system. Independent recovery has been proven unattainable in one-copy systems. Here we show that independent recovery is possible in systems with replicated data by presenting such a protocol. We also report on simulation and analytical studies examining its performance and availability characteristics.