Autonomic application and resource management in virtualized distributed computing systems

  • Authors:
  • Jose Fortes; Jing Xu

  • Affiliations:
  • University of Florida; University of Florida

  • Venue:
  • Autonomic application and resource management in virtualized distributed computing systems
  • Year:
  • 2011

Abstract

Large-scale distributed computing systems, such as computational grids and enterprise data centers, present complex management challenges. Such systems are inherently dynamic because of unpredictable resource availability and usage and/or highly dynamic workloads. By introducing a layer of abstraction, virtualization technology provides ways of provisioning and customizing resource environments as needed, and of migrating workloads to adapt to dynamic changes. However, the scale of such computing systems makes it extremely hard for one or more human operators to control them manually.
Our solution is to incorporate autonomic capabilities into the management of applications and resources in grid and data center environments to reduce direct human intervention. Such capabilities are accomplished through a two-level feedback-control framework in which local controllers at the application level have detailed information about the applications and allow independent adaptation and optimization. The global controller at the resource level collects resource information and optimizes the system behavior from a global perspective. It also acts as a coordinator when conflicts occur among different local controllers. For grid environments, the proposed two-level control system is studied in the context of In-VIGO, a grid-computing system that provides application services on demand using dynamically instantiated virtual machines, networks, data and applications. Local controllers utilize application-specific information for tracking and predicting the performance of jobs executing on grid resources, and these predictions are then used to guide the scheduling and rescheduling decisions. The effectiveness of this approach has been evaluated for CPU-intensive jobs with relatively short execution times (ranging from tens of seconds to less than an hour) on resources with highly variable loads.
The results show that In-VIGO jobs managed by the two-level controllers consistently meet their execution deadlines under varying load conditions and recover gracefully from unexpected failures. Under the most dynamic and heavily loaded environment created by the experiments, the average job runtime of the proposed approach is 10% and 20% shorter than that of two competing scheduling strategies, one using round-robin and the other using the same scheduling as the proposed approach but without rescheduling actions. The percentage of jobs meeting their predefined deadlines is improved by 40% and 50%, respectively.
In a virtualized data center, the two-level control system is designed to deliver performance guarantees while optimizing resource usage and other important aspects of data center operation, such as power and cooling costs. At the application level, two fuzzy-logic-based methods, fuzzy modeling and fuzzy prediction, are proposed to estimate the resource demands of dynamic workloads. The global controller at the resource level tries to find the optimal resource allocation and virtual machine (VM) placement/replacement, with multiple objectives including the elimination of thermal hotspots, the minimization of total power consumption, and the efficient use of resources. The problem is posed as a multi-objective combinatorial optimization problem, and an improved genetic algorithm with fuzzy multi-objective evaluation is proposed for efficiently searching the large solution space and conveniently combining possibly conflicting objectives. An online local search algorithm using multi-objective optimization and stabilization techniques is designed to change virtual machine placement dynamically and adapt quickly to changes in system conditions or workloads. The proposed approaches are implemented and evaluated on a virtualized testbed built upon an IBM BladeCenter.
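The idea of scoring a VM placement against several possibly conflicting objectives can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the dissertation: the linear power model, the use of peak CPU utilization as a stand-in for thermal hotspots, the linear membership functions, the `min` (fuzzy AND) combination, and the names `Host`, `falling_membership`, `evaluate` and `best_placement`. An exhaustive search over a tiny problem replaces the genetic algorithm, which would be needed for realistic problem sizes.

```python
from dataclasses import dataclass
from itertools import product


@dataclass
class Host:
    cpu_capacity: float  # normalized CPU capacity (hypothetical units)
    idle_power: float    # watts when powered on but idle (assumed model)
    peak_power: float    # watts at full utilization (assumed model)


def falling_membership(value, best, worst):
    """Linear fuzzy membership: 1.0 at or below `best`, 0.0 at or above `worst`."""
    if value <= best:
        return 1.0
    if value >= worst:
        return 0.0
    return (worst - value) / (worst - best)


def evaluate(placement, vm_cpu, hosts):
    """Score a placement, where placement[i] is the host index of VM i.

    Returns a value in [0, 1]; higher is better. Infeasible placements score 0.0.
    """
    load = [0.0] * len(hosts)
    for vm, host_index in enumerate(placement):
        load[host_index] += vm_cpu[vm]
    if any(used > host.cpu_capacity for used, host in zip(load, hosts)):
        return 0.0

    power = wastage = peak_util = 0.0
    for used, host in zip(load, hosts):
        if used == 0.0:
            continue  # assumption: unused hosts are powered off
        util = used / host.cpu_capacity
        power += host.idle_power + (host.peak_power - host.idle_power) * util
        wastage += host.cpu_capacity - used
        peak_util = max(peak_util, util)  # assumed proxy for thermal hotspots

    worst_power = sum(h.peak_power for h in hosts)
    worst_wastage = sum(h.cpu_capacity for h in hosts)
    satisfaction = (
        falling_membership(power, 0.0, worst_power),
        falling_membership(wastage, 0.0, worst_wastage),
        falling_membership(peak_util, 0.5, 1.0),
    )
    # Fuzzy AND: a placement is only as good as its least satisfied objective.
    return min(satisfaction)


def best_placement(vm_cpu, hosts):
    """Exhaustive search; a stand-in for a genetic algorithm on small inputs."""
    candidates = product(range(len(hosts)), repeat=len(vm_cpu))
    return max(candidates, key=lambda p: evaluate(p, vm_cpu, hosts))
```

With three identical hosts and three equal VMs, this scoring prefers a partial consolidation over both spreading the VMs across all hosts (high power and wastage) and packing them onto one host (high peak utilization), which is the kind of balance a multi-objective evaluation is meant to express.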
Under both synthetic and real-world Web workloads, the local controller is shown to estimate resource needs accurately (the difference is within 5%) using the fuzzy modeling and fuzzy prediction approaches. The global controller for determining virtual machine placement is tested with simulation-based experiments over a wide range of problem sizes, and the results show that the multi-objective optimization using the genetic algorithm achieves a good balance among the different objectives, resulting in relatively low values for power consumption, peak temperature, and resource wastage. For the dynamic virtual machine migration problem, experimental evaluations are conducted using a mix of workload types to emulate the variety and dynamics of data center workloads. The results indicate that the proposed multi-objective optimization with stabilization reduces unnecessary VM migrations and unstable host selections by up to 80%, improves application performance by up to 30%, and improves power-usage efficiency by up to 20%.
The rapid growth of computing systems raises new challenges for centralized management at the global-control level in the proposed two-level architecture. A network of cooperative controllers is proposed in this work, each managing a subset of resources and all collaborating to manage the entire system. The proposed network model is validated on a testbed for In-VIGO, and the results show that the decentralized and cooperative nature of the system yields a number of desirable properties, including efficiency, robustness, and scalability in a highly dynamic environment.
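The two-level structure described in the abstract, with local controllers estimating per-application demand and a global controller reconciling those demands against shared capacity, can be sketched roughly as follows. This is an illustrative sketch under stated assumptions, not the authors' design: exponential smoothing stands in for the fuzzy estimation methods, proportional scaling stands in for the global controller's conflict resolution, and the class and method names are hypothetical.

```python
class LocalController:
    """Application-level controller that tracks one application's demand."""

    def __init__(self, name, smoothing=0.5):
        self.name = name
        self.smoothing = smoothing  # weight given to the newest measurement
        self.estimate = None        # no demand estimate until first observation

    def observe(self, measured_usage):
        # Exponential smoothing: an assumed placeholder for the fuzzy
        # modeling and prediction methods mentioned in the abstract.
        if self.estimate is None:
            self.estimate = measured_usage
        else:
            self.estimate = (self.smoothing * measured_usage
                             + (1 - self.smoothing) * self.estimate)
        return self.estimate


class GlobalController:
    """Resource-level controller that allocates a shared capacity."""

    def __init__(self, capacity):
        self.capacity = capacity

    def allocate(self, controllers):
        requests = {c.name: (c.estimate or 0.0) for c in controllers}
        total = sum(requests.values())
        if total <= self.capacity:
            return requests  # every request can be granted in full
        # Conflict: demands exceed capacity, so scale all requests down
        # proportionally (an assumed, deliberately simple policy).
        scale = self.capacity / total
        return {name: amount * scale for name, amount in requests.items()}
```

In each control period, every local controller would call `observe` with a fresh measurement, and the global controller would then call `allocate` to produce the allocations applied in the next period, closing the feedback loop.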