A performance study of three high availability data replication strategies

  • Authors:
  • Hui-I Hsiao; David J. DeWitt

  • Venue:
  • PDIS '91: Proceedings of the First International Conference on Parallel and Distributed Information Systems
  • Year:
  • 1991

Abstract

Several data replication strategies have been proposed to provide high data availability for database applications. However, the tradeoffs among these strategies across workloads and operating modes have not previously been studied. In this paper, we study the relative performance of chained declustering, mirrored disks, and interleaved declustering in a shared-nothing database machine environment. In particular, we examine the relative performance of the three strategies when no failures have occurred, the effect of the load imbalance caused by a disk or processor failure on system throughput and response time, and the tradeoff between the benefit of intra-query parallelism and process scheduling overhead.

Experimental results obtained from a simulation study indicate that, in the normal mode of operation, chained declustering and interleaved declustering perform comparably. Both perform better than mirrored disks if an application is I/O bound, but slightly worse than mirrored disks if the application is CPU bound. In the event of a disk failure, chained declustering is able to balance the workload among all remaining operational disks while the other two strategies cannot; as a result, it provides noticeably better performance than interleaved declustering and much better performance than mirrored disks.
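The load-balancing claim for the failure case can be illustrated with a small idealized model. The sketch below is not the paper's simulator. It assumes that every disk carries one unit of read load in normal operation, that mirrored disks are paired as (0,1), (2,3), and so on, that interleaved declustering spreads a failed disk's load evenly over the other disks of its cluster, and that chained declustering keeps the backup of fragment i on disk (i+1) mod n. The function names and the `cluster_size` parameter are hypothetical, chosen only for illustration.

```python
from fractions import Fraction


def mirrored_loads(n, failed):
    """Per-disk load after one failure, assuming pairs (0,1), (2,3), ...

    The failed disk's mirror partner absorbs its entire load.
    """
    partner = failed ^ 1  # flip the low bit to find the other disk in the pair
    return {d: Fraction(2) if d == partner else Fraction(1)
            for d in range(n) if d != failed}


def interleaved_loads(n, failed, cluster_size):
    """Per-disk load after one failure, assuming n is a multiple of cluster_size.

    The failed disk's load is split evenly over the other disks in its cluster.
    """
    cluster = failed // cluster_size
    extra = Fraction(1, cluster_size - 1)
    return {d: Fraction(1) + (extra if d // cluster_size == cluster else 0)
            for d in range(n) if d != failed}


def chained_loads(n, failed):
    """Per-disk load after one failure under the assumed chained placement.

    The disk k steps after the failed one serves k/(n-1) of its own primary
    fragment and (n-k)/(n-1) of the backup fragment it holds, so the extra
    work is shifted around the chain instead of landing on one neighbour.
    """
    loads = {}
    for k in range(1, n):
        disk = (failed + k) % n
        primary_share = Fraction(k, n - 1)
        backup_share = Fraction(n - k, n - 1)
        loads[disk] = primary_share + backup_share
    return loads


if __name__ == "__main__":
    n, failed = 8, 2
    print(max(mirrored_loads(n, failed).values()))        # 2
    print(max(interleaved_loads(n, failed, 4).values()))  # 4/3
    print(max(chained_loads(n, failed).values()))         # 8/7
```

Under these assumptions, with 8 disks and clusters of 4, the busiest surviving disk carries 2 units with mirrored disks, 4/3 with interleaved declustering, and 8/7 with chained declustering, which matches the ordering reported for the failure case.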