On the use of hybrid reinforcement learning for autonomic resource allocation

  • Authors:
  • Gerald Tesauro; Nicholas K. Jong; Rajarshi Das; Mohamed N. Bennani

  • Affiliations:
  • IBM TJ Watson Research Center, Hawthorne, USA 10532; Dept. of Computer Sciences, Univ. of Texas, Austin, USA 78712; IBM TJ Watson Research Center, Hawthorne, USA 10532; Oracle Inc., Portland, USA 97204

  • Venue:
  • Cluster Computing
  • Year:
  • 2007

Abstract

Reinforcement Learning (RL) provides a promising new approach to systems performance management that differs radically from standard queuing-theoretic approaches, which rely on explicit system performance models. In principle, RL can automatically learn high-quality management policies without an explicit performance model or traffic model, and with little or no built-in system-specific knowledge. In our original work (Das, R., Tesauro, G., Walsh, W.E.: IBM Research, Tech. Rep. RC23802 (2005); Tesauro, G.: In: Proc. of AAAI-05, pp. 886–891 (2005); Tesauro, G., Das, R., Walsh, W.E., Kephart, J.O.: In: Proc. of ICAC-05, pp. 342–343 (2005)) we showed the feasibility of using online RL to learn resource valuation estimates (in lookup table form) which can be used to make high-quality server allocation decisions in a multi-application prototype Data Center scenario. The present work shows how to combine the strengths of both RL and queuing models in a hybrid approach, in which RL trains offline on data collected while a queuing model policy controls the system. By training offline, we avoid the potentially poor performance of live online training. We also now use RL to train nonlinear function approximators (e.g. multi-layer perceptrons) instead of lookup tables; this enables scaling to substantially larger state spaces. Our results show that, in both open-loop and closed-loop traffic, hybrid RL training can achieve significant performance improvements over a variety of initial model-based policies. We also find that, as expected, RL can deal effectively with both transients and switching delays, which lie outside the scope of traditional steady-state queuing theory.
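
The sketch below is a minimal illustration (not the authors' code) of the offline "hybrid RL" idea described in the abstract: a small multi-layer perceptron is fit, by TD(0)-style bootstrapping, to transitions logged while a queuing-model policy controlled the system. The state encoding, the toy reward, and all function and variable names are illustrative assumptions.

```python
"""Hedged sketch of offline hybrid-RL training on a logged trace.
All names, shapes, and the reward definition are illustrative assumptions."""
import numpy as np

rng = np.random.default_rng(0)

def make_fake_trace(n=500, max_servers=8):
    """Hypothetical logged transitions (state, next_state, reward) collected
    while a queuing-model policy allocated servers; purely synthetic here."""
    trace = []
    for _ in range(n):
        load = rng.uniform(0.0, 1.0)                     # normalized demand
        servers = rng.integers(1, max_servers + 1)       # allocation chosen by the model policy
        reward = -abs(load * max_servers - servers)      # toy SLA-style utility
        next_load = float(np.clip(load + rng.normal(0, 0.05), 0, 1))
        next_servers = rng.integers(1, max_servers + 1)
        trace.append((np.array([load, servers / max_servers]),
                      np.array([next_load, next_servers / max_servers]),
                      reward))
    return trace

class MLP:
    """Tiny one-hidden-layer value approximator V(state)."""
    def __init__(self, n_in=2, n_hidden=16, lr=1e-2):
        self.W1 = rng.normal(0, 0.5, (n_in, n_hidden))
        self.b1 = np.zeros(n_hidden)
        self.W2 = rng.normal(0, 0.5, (n_hidden, 1))
        self.b2 = np.zeros(1)
        self.lr = lr

    def forward(self, x):
        h = np.tanh(x @ self.W1 + self.b1)
        return h, (h @ self.W2 + self.b2)[0]

    def update(self, x, target):
        """One squared-error gradient step toward the bootstrapped target."""
        h, v = self.forward(x)
        err = v - target
        self.W2 -= self.lr * err * h[:, None]
        self.b2 -= self.lr * err
        dh = err * self.W2[:, 0] * (1 - h ** 2)
        self.W1 -= self.lr * np.outer(x, dh)
        self.b1 -= self.lr * dh

def train_offline(trace, gamma=0.5, epochs=30):
    """Offline TD(0)-style training: no live exploration, only the logged trace."""
    vf = MLP()
    for _ in range(epochs):
        for s, s_next, r in trace:
            _, v_next = vf.forward(s_next)
            vf.update(s, r + gamma * v_next)
    return vf

vf = train_offline(make_fake_trace())
_, v = vf.forward(np.array([0.7, 0.5]))
print(f"learned value estimate for (load=0.7, 4 of 8 servers): {v:.3f}")
```

In an allocation setting such as the one the paper studies, value estimates of this kind would then be compared across candidate server counts to choose an allocation; that decision step is omitted from the sketch for brevity.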