SLA-aware resource over-commit in an IaaS cloud

Authors:
David Breitgand;Zvi Dubitzky;Amir Epstein;Alex Glikson;Inbar Shapira
Affiliations:
IBM Haifa Research Lab, Haifa, Israel;IBM Haifa Research Lab, Haifa, Israel;IBM Haifa Research Lab, Haifa, Israel;IBM Haifa Research Lab, Haifa, Israel;IBM Haifa Research Lab, Haifa, Israel
Venue:
Proceedings of the 8th International Conference on Network and Service Management
Year:
2012

Abstract

The cloud paradigm facilitates cost-efficient elastic computing by allowing workloads to scale on demand. As cloud size increases, the probability that all workloads simultaneously scale up to their maximum demand diminishes. This observation allows cloud resources to be multiplexed among multiple workloads, greatly improving resource utilization. The ability to host virtualized workloads such that the available physical capacity is smaller than the sum of the workloads' maximal demands is referred to as over-commit or over-subscription. Naturally, over-commit implies a risk of resource congestion. There is therefore a tradeoff between improving resource utilization by increasing the over-commit ratio and exposing the infrastructure provider and its customers to the risk of resource congestion. In this work, we observe that while resource multiplexing occurs naturally in the cloud, the risks associated with exploiting it for higher levels of cloud utilization are not transparent to customers. We consider workloads comprising elastic groups of Virtual Machines (VMs). We suggest that cloud providers extend the standard availability Service Level Agreement (SLA) to express the probability of successfully launching a VM (to expand a workload), complementing the current practice of offering a standard SLA on the availability of VMs that have already been launched successfully. Using the proposed extended availability SLA, we introduce the notion of cloud effective demand, which generalizes the previously introduced notions of the effective size of a single VM and the effective bandwidth of stand-alone and multiplexed network connections. We propose an algorithmic framework that uses cloud effective demand to estimate the total physical capacity required for SLA compliance under over-commit. We evaluate the proposed methodology using simulations based on data collected from a real private cloud production environment.
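
The multiplexing argument in the abstract can be illustrated with a small numerical sketch. This is not the paper's algorithm: the abstract does not define cloud effective demand, so the code makes a simplifying assumption that every VM in an elastic group is active independently with a fixed probability. The names `demand_distribution`, `required_capacity`, and `over_commit_ratio` are hypothetical. Under that assumption, it computes the smallest capacity (in VM slots) at which total demand fits with at least a target probability, which loosely stands in for the launch-success probability of the extended SLA.

```python
def demand_distribution(workloads):
    """Exact distribution of the total number of active VMs.

    Each workload is a pair (max_vms, p_active). ASSUMPTION (not from the
    paper): each of its max_vms VMs is active independently with
    probability p_active.
    """
    dist = [1.0]  # dist[k] = P(total demand == k VMs)
    for max_vms, p_active in workloads:
        for _ in range(max_vms):
            # Convolve the current distribution with one Bernoulli VM.
            nxt = [0.0] * (len(dist) + 1)
            for k, prob in enumerate(dist):
                nxt[k] += prob * (1.0 - p_active)
                nxt[k + 1] += prob * p_active
            dist = nxt
    return dist


def required_capacity(workloads, target_prob):
    """Smallest capacity C such that P(total demand <= C) >= target_prob."""
    dist = demand_distribution(workloads)
    cumulative = 0.0
    for capacity, prob in enumerate(dist):
        cumulative += prob
        if cumulative >= target_prob:
            return capacity
    # Rounding can leave the sum just below 1.0; fall back to maximal demand.
    return len(dist) - 1


def over_commit_ratio(workloads, capacity):
    """Sum of maximal demands divided by the provisioned capacity."""
    return sum(max_vms for max_vms, _ in workloads) / capacity


if __name__ == "__main__":
    # Hypothetical example: 20 elastic groups, up to 10 VMs each, 30% active.
    groups = [(10, 0.3)] * 20
    capacity = required_capacity(groups, 0.999)
    print(capacity, over_commit_ratio(groups, capacity))
```

In this toy model the required capacity stays below the sum of maximal demands whenever VMs are not always active, and it grows as the target probability tightens, which mirrors the utilization-versus-risk tradeoff the abstract describes. Real workloads are typically correlated, so the independence assumption would understate the capacity needed.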