Devirtualizable virtual machines enabling general, single-node, online maintenance

Authors:
David E. Lowell; Yasushi Saito; Eileen J. Samberg
Affiliations:
Hewlett-Packard Laboratories, Palo Alto, CA; Hewlett-Packard Laboratories, Palo Alto, CA; Hewlett-Packard Laboratories, Palo Alto, CA
Venue:
ASPLOS XI Proceedings of the 11th international conference on Architectural support for programming languages and operating systems
Year:
2004

Abstract

Maintenance is the dominant source of downtime at high-availability sites. Unfortunately, the dominant mechanism for reducing this downtime, cluster rolling upgrade, has two shortcomings that have prevented its broad acceptance. First, cluster-style maintenance over many nodes is typically performed a few nodes at a time, making maintenance slow and often impractical. Second, cluster-style maintenance does not work on single-node systems, despite the fact that their unavailability during maintenance can be painful for organizations. In this paper, we propose a novel technique for online maintenance that uses virtual machines to provide maintenance on single nodes, allowing parallel maintenance over multiple nodes, and online maintenance for standalone servers. We present the Microvisor, our prototype virtual machine system that is custom-tailored to the needs of online maintenance. Unlike general-purpose virtual machine environments that induce continual 10-20% overhead, the Microvisor virtualizes the hardware only during periods of active maintenance, letting the guest OS run at full speed most of the time. Unlike past attempts at virtual machine optimization, we do not compromise OS transparency. We instead give up generality and tailor our virtual machine system to the minimum needs of online maintenance, eschewing features, such as I/O and memory virtualization, that it does not strictly require. The result is a very thin virtual machine system that induces only 5.6% CPU overhead when virtualizing the hardware, and zero CPU overhead when devirtualized. Using the Microvisor, we demonstrate an online OS upgrade on a live, single-node web server, reducing downtime from one hour to less than one minute.