Fast transparent migration for virtual machines
ATEC '05 Proceedings of the annual conference on USENIX Annual Technical Conference
Failure trends in a large disk drive population
FAST '07 Proceedings of the 5th USENIX conference on File and Storage Technologies
The Datacenter as a Computer: An Introduction to the Design of Warehouse-Scale Machines
The Datacenter as a Computer: An Introduction to the Design of Warehouse-Scale Machines
The impact of management operations on the virtualized datacenter
Proceedings of the 37th annual international symposium on Computer architecture
The design of a practical system for fault-tolerant virtual machines
ACM SIGOPS Operating Systems Review
Enabling a marketplace of clouds: VMware's vCloud director
ACM SIGOPS Operating Systems Review
Content-aware load balancing for distributed backup
LISA'11 Proceedings of the 25th international conference on Large Installation System Administration
Hi-index | 0.00 |
Virtualization drives higher resource utilization and makes provisioning new systems very easy and cheap. This combination has led to an ever-increasing number of virtual machines: the largest data centers will likely have more than 100K in few years, and many deployments will span multiple data centers. Virtual machines are also getting increasingly more capable, consisting of more vCPUs, more memory, and higher-bandwidth virtual I/O devices with a variety of capabilities like bandwidth throttling and traffic mirroring To reduce the work for IT administrators managing these environments, VMware and other companies provide several monitoring, automation, and policy-driven tools. These tools require a lot of information about various aspects of each VM and other objects in the system, such as physical hosts, storage infrastructure, and networking. To support these tools and the hundreds of simultaneous users who manage the environment, the management software needs to provide secure access to the data in real-time with some degree of consistency and backwardcompatibility, and very high availability under a variety of failures and planned maintenance. Such software must satisfy a continuum of designs: it must perform well at large-scale to accommodate the largest datacenters, but it must also accommodate smaller deployments by limiting its resource consumption and overhead according to demand. The need for high-performance, robust management tools that scale from a few hosts to cloud-scale poses interesting challenges for the management software. This paper presents some of the techniques we have employed to address these challenges