Enhancing TCP throughput of highly available virtual machines via speculative communication

  • Authors: Balazs Gerofi; Yutaka Ishikawa

  • Affiliations: The University of Tokyo, Tokyo, Japan; The University of Tokyo, Tokyo, Japan

  • Venue: VEE '12 Proceedings of the 8th ACM SIGPLAN/SIGOPS conference on Virtual Execution Environments
  • Year: 2012
  • Paravirtualizing TCP

    Proceedings of the 6th international workshop on Virtualization Technologies in Distributed Computing

  • Streaming as a hypervisor service

    Proceedings of the 7th international workshop on Virtualization technologies in distributed computing

Abstract

Checkpoint-recovery based virtual machine (VM) replication is an attractive technique for providing VM installations with high availability. It provides seamless failover for the entire software stack executed in the VM regardless of the application or the underlying operating system (OS), it runs on commodity hardware, and it is inherently capable of dealing with the shared memory non-determinism of symmetric multiprocessing (SMP) configurations. Several studies have aimed at alleviating the overhead of replication; however, due to consistency requirements, the network performance of the basic replication mechanism remains extremely poor. In this paper we revisit the replication protocol and extend it with speculative communication. Speculative communication silently acknowledges TCP packets of the VM, enabling the guest's TCP stack to progress with transmission without exposing the messages to the clients before the corresponding execution state is checkpointed to the backup host. Furthermore, we propose replication aware congestion control, an extension to the guest's TCP stack that aggressively fills up the VMM's replication buffer so that speculative packets can be backed up and released to the clients earlier. We observe up to an order of magnitude improvement in bulk data transfer with speculative communication, and close to native VM network performance when replication awareness is enabled in the guest OS. We provide results of micro-benchmarks as well as application-level benchmarks.
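
The buffering idea described in the abstract can be illustrated with a toy model. This is only a sketch under stated assumptions, not the authors' implementation: the class `SpeculativeBuffer` and its methods `guest_send` and `checkpoint_committed` are hypothetical names introduced here, and real TCP acknowledgement, sequence numbers, and checkpoint transfer are all omitted. It shows only the ordering constraint: the guest is acknowledged immediately, while clients see a packet only after the checkpoint covering it has reached the backup host.

```python
from collections import deque


class SpeculativeBuffer:
    """Toy model of a VMM-side buffer for outgoing guest packets (hypothetical)."""

    def __init__(self):
        # Packets already acknowledged to the guest but not yet visible to clients.
        self.pending = deque()
        # Packets that have actually been exposed to clients.
        self.released = []

    def guest_send(self, packet):
        # Buffer the packet and acknowledge it to the guest right away,
        # so the guest's TCP stack can keep transmitting.
        self.pending.append(packet)
        return "ACK"

    def checkpoint_committed(self):
        # Assumption: the backup host now holds the execution state that
        # produced every pending packet, so they may be exposed to clients.
        count = len(self.pending)
        while self.pending:
            self.released.append(self.pending.popleft())
        return count


buf = SpeculativeBuffer()
acks = [buf.guest_send(f"segment-{i}") for i in range(3)]
visible_before = list(buf.released)   # nothing is visible before the checkpoint
flushed = buf.checkpoint_committed()  # checkpoint reaches the backup host
visible_after = list(buf.released)    # all buffered segments are now released
```

In this model the guest receives three acknowledgements before any segment leaves the buffer, which is the decoupling the abstract attributes to speculative communication. Replication aware congestion control, as described above, would aim to keep `pending` full between checkpoints; that part is not modeled here.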