Empirical virtual machine models for performance guarantees

  • Authors:
  • Andrew Turner; Akkarit Sangpetch; Hyong S. Kim

  • Affiliations:
  • Department of Electrical and Computer Engineering, Carnegie Mellon University, Pittsburgh, PA (all authors)

  • Venue:
  • LISA '10: Proceedings of the 24th International Conference on Large Installation System Administration
  • Year:
  • 2010

Abstract

Existing Virtual Machine (VM) management systems rely on host resource utilization metrics to allocate and schedule VMs. Many management systems only consolidate and migrate VMs based on hosts' CPU utilizations. However, the performance of delay-sensitive workloads, such as web services and online transaction processing, can be severely degraded by contention on many of the hosts' components. Current VM management systems typically use threshold-based rules to decide when to migrate VMs, rather than using application-level performance. This means that they cannot easily provide application-level service level objective (SLO) guarantees. Providing SLO guarantees is even more difficult when considering that today's enterprise applications often consist of multiple VM tiers. In this paper we show how the performance of a multi-tiered VM application can be empirically captured, modeled and scaled. This allows our management system to guarantee application-level performance, despite variable host utilization and VM workload levels. Additionally, it can predict the performance of an application at host utilization levels that have not been previously observed. This is achieved by performing regression analysis on the previously observed values and scaling the application's performance model. This allows the performance of a VM to be predicted before it is migrated to or from a new host. We have found that by dynamically, rather than statically, allocating resources, average response time can be improved by 30%. Additionally, we found that resource allocations can be reduced by 20%, with no degradation in response time.
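
The regression-based prediction described above can be illustrated with a minimal sketch. This is not the authors' implementation: the sample data, variable names, and the choice of a low-order polynomial fit are illustrative assumptions showing how previously observed (host utilization, response time) pairs for one VM tier could be fit and then extrapolated to a utilization level not yet observed, for example before migrating the VM to a busier host.

    # Hypothetical sketch of per-tier regression on observed performance data.
    # Names and numbers are illustrative, not taken from the paper.
    import numpy as np

    # Observed samples for one VM tier: (host CPU utilization %, mean response time ms)
    observations = [
        (20, 45.0), (35, 52.0), (50, 61.0), (65, 78.0), (80, 104.0),
    ]

    util = np.array([u for u, _ in observations], dtype=float)
    resp = np.array([r for _, r in observations], dtype=float)

    # Least-squares fit; a quadratic is only one plausible empirical model choice.
    coeffs = np.polyfit(util, resp, deg=2)
    model = np.poly1d(coeffs)

    # Predict response time at a host utilization level not previously observed.
    predicted_ms = model(90.0)
    print(f"Predicted response time at 90% host CPU: {predicted_ms:.1f} ms")

In this sketch, the fitted curve plays the role of the scaled performance model: it lets the management layer estimate whether a candidate placement would still meet the tier's SLO before the migration is performed.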