Virtualization makes whole-machine migration possible and thus enables a new form of fault tolerance that is completely transparent to applications and operating systems. The most seamless virtualization-based fault-tolerance configuration is an active/active master-slave configuration, in which the memory states of the master and slave virtual machines are periodically synchronized so that the slave can take over immediately when the master dies, without losing any ongoing connections. The frequency of memory state synchronization directly affects the performance overhead, the application response time, and the fail-over delay. This paper describes a speculative memory state synchronization technique that can reduce the synchronization frequency without increasing the performance overhead, and presents a comprehensive performance study of this technique under three realistic workloads: the TPC-E benchmark, the SPECsfs 2008 CIFS benchmark, and a Microsoft Exchange workload. We show that the proposed technique can cut the amount of memory state synchronization traffic by more than an order of magnitude.
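
To make the link between synchronization frequency and synchronization traffic concrete, the following is a minimal toy model of periodic dirty-page synchronization. It is not the speculative technique the paper proposes, whose details the abstract does not give. The function name `sync_traffic`, the write trace, and the epoch lengths are all hypothetical and chosen only for illustration. The sketch assumes that at the end of each epoch the master sends every page dirtied during that epoch exactly once, so repeated writes to the same page within an epoch coalesce into a single transfer.

```python
from typing import List, Tuple


def sync_traffic(writes: List[Tuple[int, int]], interval: int) -> int:
    """Count the pages sent to the backup VM under periodic synchronization.

    writes   -- (timestamp, page_number) pairs, sorted by timestamp
    interval -- length of one synchronization epoch, in the same time unit

    Hypothetical helper for illustration; not taken from the paper.
    """
    if interval <= 0:
        raise ValueError("interval must be positive")
    pages_sent = 0
    dirty = set()          # pages written during the current epoch
    epoch_end = interval   # first timestamp that belongs to the next epoch
    for timestamp, page in writes:
        # Close every epoch that ended before this write happened.
        while timestamp >= epoch_end:
            pages_sent += len(dirty)
            dirty.clear()
            epoch_end += interval
        dirty.add(page)
    pages_sent += len(dirty)  # flush the final, possibly partial, epoch
    return pages_sent


# Assumed trace: 100 writes, one per time unit, cycling over 4 hot pages.
writes = [(t, t % 4) for t in range(100)]

for interval in (1, 10, 100):
    print(interval, sync_traffic(writes, interval))
```

In this toy trace, longer epochs send fewer pages because the same hot pages are rewritten many times between synchronizations. The model captures only the traffic side of the trade-off; as the abstract notes, the synchronization frequency also affects application response time and fail-over delay.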