Power and performance management of virtualized computing environments via lookahead control

  • Authors:
  • Dara Kusic; Jeffrey O. Kephart; James E. Hanson; Nagarajan Kandasamy; Guofei Jiang

  • Affiliations:
  • Electrical and Computer Engineering Department, Drexel University, Philadelphia, USA 19104; Agents and Emergent Phenomena Group, IBM T.J. Watson Research Center, Hawthorne, USA 10532; Agents and Emergent Phenomena Group, IBM T.J. Watson Research Center, Hawthorne, USA 10532; Electrical and Computer Engineering Department, Drexel University, Philadelphia, USA 19104; Robust and Secure System Group, NEC Laboratories America, Princeton, USA 08540

  • Venue:
  • Cluster Computing
  • Year:
  • 2009

Abstract

There is growing incentive to reduce the power consumed by large-scale data centers that host online services such as banking, retail commerce, and gaming. Virtualization is a promising approach to consolidating multiple online services onto a smaller number of computing resources. A virtualized server environment allows computing resources to be shared among multiple performance-isolated platforms called virtual machines. By dynamically provisioning virtual machines, consolidating the workload, and turning servers on and off as needed, data center operators can maintain the desired quality-of-service (QoS) while achieving higher server utilization and energy efficiency. We implement and validate a dynamic resource provisioning framework for virtualized server environments wherein the provisioning problem is posed as one of sequential optimization under uncertainty and solved using a lookahead control scheme. The proposed approach accounts for the switching costs incurred while provisioning virtual machines and explicitly encodes the corresponding risk in the optimization problem. Experiments using the Trade6 enterprise application show that a server cluster managed by the controller conserves, on average, 22% of the power required by a system without dynamic control while still maintaining QoS goals. Finally, we use trace-based simulations to analyze controller performance on server clusters larger than our testbed, and show how concepts from approximation theory can be used to further reduce the computational burden of controlling large systems.
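
The lookahead idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' controller: the cost terms, the constants (`capacity`, `power_per_server`, `sla_penalty`, `switch_cost`), and the function names are all hypothetical, the load forecast is taken as given, and the risk term mentioned in the abstract is not modeled. The sketch only shows how a controller might enumerate provisioning plans over a short horizon, charge a switching cost for every change in the number of active servers, and apply just the first decision of the cheapest plan.

```python
from itertools import product


def stage_cost(servers, prev_servers, load, capacity=100.0,
               power_per_server=1.0, sla_penalty=0.05, switch_cost=0.5):
    """Cost of one control step (all constants are illustrative assumptions)."""
    # Load that exceeds the total capacity of the active servers is
    # treated as a QoS violation and penalized linearly.
    unserved = max(0.0, load - servers * capacity)
    return (power_per_server * servers
            + sla_penalty * unserved
            + switch_cost * abs(servers - prev_servers))


def lookahead_control(current_servers, forecast, max_servers=4):
    """Exhaustively search all plans over the forecast horizon.

    Returns the first decision of the cheapest plan and that plan's cost.
    """
    best_cost, best_plan = float("inf"), None
    for plan in product(range(1, max_servers + 1), repeat=len(forecast)):
        cost, prev = 0.0, current_servers
        for servers, load in zip(plan, forecast):
            cost += stage_cost(servers, prev, load)
            prev = servers
        if cost < best_cost:
            best_cost, best_plan = cost, plan
    # Receding horizon: only the first action is applied; the search is
    # repeated at the next step with an updated forecast.
    return best_plan[0], best_cost


if __name__ == "__main__":
    # Sustained overload: worth paying the switching cost to add a server.
    print(lookahead_control(1, [150.0, 150.0, 150.0]))
    # Brief spike: cheaper to absorb the penalty than to switch twice.
    print(lookahead_control(1, [90.0, 110.0, 90.0]))
```

The exhaustive search grows exponentially with the horizon length and the number of servers, which is why the abstract's point about reducing the computational burden matters for larger clusters.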