SecondSite: disaster tolerance as a service

  • Authors:
  • Shriram Rajagopalan;Brendan Cully;Ryan O'Connor;Andrew Warfield

  • Affiliations:
  • University of British Columbia, Vancouver, BC, Canada;University of British Columbia, Vancouver, BC, Canada;University of British Columbia, Vancouver, BC, Canada;University of British Columbia, Vancouver, BC, Canada

  • Venue:
  • VEE '12 Proceedings of the 8th ACM SIGPLAN/SIGOPS conference on Virtual Execution Environments
  • Year:
  • 2012

Quantified Score

Hi-index 0.00

Visualization

Abstract

This paper describes the design and implementation of SecondSite, a cloud-based service for disaster tolerance. SecondSite extends the Remus virtualization-based high availability system by allowing groups of virtual machines to be replicated across data centers over wide-area Internet links. The goal of the system is to commodify the property of availability, exposing it as a simple tick box when configuring a new virtual machine. To achieve this in the wide area, we have had to tackle the related issues of replication traffic bandwidth, reliable failure detection across geographic regions and traffic redirection over a wide-area network without compromising on transparency and consistency.