Enhancing the performance of high availability lightweight live migration

  • Authors:
  • Peng Lu;Binoy Ravindran;Changsoo Kim

  • Affiliations:
  • ECE Department, Virginia Tech;ECE Department, Virginia Tech;ETRI, Daejeon, South Korea

  • Venue:
  • OPODIS'11 Proceedings of the 15th international conference on Principles of Distributed Systems
  • Year:
  • 2011

Abstract

Remus is one of the first systems to implement whole virtual machine replication to achieve high availability (HA). Recently, a fast, lightweight migration mechanism (LLM) was proposed to reduce the long network delay in Remus. However, both of these virtualized systems suffer from long downtime, which is a bottleneck in achieving HA. Building on LLM, this paper describes a fine-grained block identification (FGBI) mechanism that reduces downtime in virtualized systems, supported by a block sharing mechanism and a hybrid compression method. We implement FGBI and evaluate it against LLM and Remus using several benchmarks, including Apache, SPECweb, NPB, and SPECsys. Our experimental results show that FGBI reduces type I downtime relative to LLM and Remus by as much as 77% and 45%, respectively, and reduces type II downtime by more than 90% and more than 70%, respectively. Moreover, in all cases, the performance overhead of FGBI is less than 13%.