Learning internal representations by error propagation. Parallel distributed processing: explorations in the microstructure of cognition, vol. 1
Temporal difference learning and TD-Gammon. Communications of the ACM
How to dynamically merge Markov decision processes. NIPS '97 Proceedings of the 1997 conference on Advances in neural information processing systems 10
Internet traffic: periodicity, tail behavior, and performance implications. System performance evaluation
Introduction to Reinforcement Learning
The Vision of Autonomic Computing. Computer
Multi-agent Q-learning and Regression Trees for Automated Pricing Decisions. ICML '00 Proceedings of the Seventeenth International Conference on Machine Learning
Dynamic Programming
Least-squares policy iteration. The Journal of Machine Learning Research
Feedback Control of Computing Systems
Performance by Design: Computer Capacity Planning By Example
An analytical model for multi-tier internet services and its applications. SIGMETRICS '05 Proceedings of the 2005 ACM SIGMETRICS international conference on Measurement and modeling of computer systems
Assessing the Robustness of Self-Managing Computer Systems under Highly Variable Workloads. ICAC '04 Proceedings of the First International Conference on Autonomic Computing
Utility Functions in Autonomic Systems. ICAC '04 Proceedings of the First International Conference on Autonomic Computing
Resource Allocation for Autonomic Data Centers using Analytic Performance Models. ICAC '05 Proceedings of the Second International Conference on Automatic Computing
A Reinforcement Learning Framework for Dynamic Resource Allocation: First Results. ICAC '05 Proceedings of the Second International Conference on Automatic Computing
Utility-Function-Driven Resource Allocation in Autonomic Systems. ICAC '05 Proceedings of the Second International Conference on Automatic Computing
A Hybrid Reinforcement Learning Approach to Autonomic Resource Allocation. ICAC '06 Proceedings of the 2006 IEEE International Conference on Autonomic Computing
Online resource allocation using decompositional reinforcement learning. AAAI'05 Proceedings of the 20th national conference on Artificial intelligence - Volume 2
Accelerating reinforcement learning through implicit imitation. Journal of Artificial Intelligence Research
Adaptive job routing and scheduling. Engineering Applications of Artificial Intelligence
Dynamic resource allocation for shared data centers using online measurements. IWQoS'03 Proceedings of the 11th international conference on Quality of service
Learning to trade via direct reinforcement. IEEE Transactions on Neural Networks
Scheduling for Reliable Execution in Autonomic Systems. ATC '08 Proceedings of the 5th international conference on Autonomic and Trusted Computing
Automated Generation of Knowledge Plane Components for Multimedia Access Networks. MACE '08 Proceedings of the 3rd IEEE international workshop on Modelling Autonomic Communications Environments
Elicitation and utilization of application-level utility functions. ICAC '09 Proceedings of the 6th international conference on Autonomic computing
VCONF: a reinforcement learning approach to virtual machines auto-configuration. ICAC '09 Proceedings of the 6th international conference on Autonomic computing
GMAC '09 Proceedings of the 6th international conference industry session on Grids meets autonomic computing
An empirical analysis of value function-based and policy search reinforcement learning. Proceedings of The 8th International Conference on Autonomous Agents and Multiagent Systems - Volume 2
Scheduling policy design for autonomic systems. International Journal of Autonomous and Adaptive Communications Systems
Globally Optimal Multi-agent Reinforcement Learning Parameters in Distributed Task Assignment. WI-IAT '09 Proceedings of the 2009 IEEE/WIC/ACM International Joint Conference on Web Intelligence and Intelligent Agent Technology - Volume 02
Discovering Piecewise Linear Models of Grid Workload. CCGRID '10 Proceedings of the 2010 10th IEEE/ACM International Conference on Cluster, Cloud and Grid Computing
Translation of service level agreements: a generic problem definition. ICSOC/ServiceWave'09 Proceedings of the 2009 international conference on Service-oriented computing
Optimizing queries to remote resources. Journal of Intelligent Information Systems
Towards Non-Stationary Grid Models. Journal of Grid Computing
URL: A unified reinforcement learning approach for autonomic cloud management. Journal of Parallel and Distributed Computing
A swarm-inspired data center consolidation methodology. Proceedings of the 2nd International Conference on Web Intelligence, Mining and Semantics
Future Generation Computer Systems
Data center selection based on neuro-fuzzy inference systems in cloud computing environments. The Journal of Supercomputing
Reinforcement Learning (RL) provides a promising new approach to systems performance management that differs radically from standard queuing-theoretic approaches, which rely on explicit system performance models. In principle, RL can automatically learn high-quality management policies without an explicit performance or traffic model, and with little or no built-in system-specific knowledge. In our original work (Das, R., Tesauro, G., Walsh, W.E.: IBM Research, Tech. Rep. RC23802 (2005); Tesauro, G.: In: Proc. of AAAI-05, pp. 886–891 (2005); Tesauro, G., Das, R., Walsh, W.E., Kephart, J.O.: In: Proc. of ICAC-05, pp. 342–343 (2005)) we showed the feasibility of using online RL to learn resource valuation estimates (in lookup-table form) that can be used to make high-quality server allocation decisions in a multi-application prototype data center scenario. The present work shows how to combine the strengths of both RL and queuing models in a hybrid approach, in which RL trains offline on data collected while a queuing-model policy controls the system. Training offline avoids the potentially poor performance of live online training. We also now use RL to train nonlinear function approximators (e.g., multi-layer perceptrons) instead of lookup tables; this enables scaling to substantially larger state spaces. Our results show that, under both open-loop and closed-loop traffic, hybrid RL training achieves significant performance improvements over a variety of initial model-based policies. We also find that, as expected, RL deals effectively with both transients and switching delays, which lie outside the scope of traditional steady-state queuing theory.
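The hybrid scheme the abstract describes can be illustrated with a minimal sketch: a queuing-model policy controls a simulated server pool while transitions are logged, and a small multi-layer perceptron value estimator is then trained offline with temporal-difference (SARSA-style) updates on that log. All dynamics, constants, and function names below are illustrative assumptions for a toy single-application scenario, not the paper's actual system or results.

```python
# Hypothetical sketch of hybrid RL: log transitions under a queuing-model
# policy, then train a tiny MLP value approximator offline on the log.
import math

import numpy as np

rng = np.random.default_rng(0)

MU = 10.0          # assumed per-server service rate (requests/s)
MAX_SERVERS = 8
GAMMA = 0.5        # discount factor

def model_policy(demand):
    """Queuing-model policy: smallest allocation keeping utilisation < 0.9."""
    n = math.ceil(demand / (0.9 * MU))
    return min(max(n, 1), MAX_SERVERS)

def reward(demand, n):
    """Assumed utility: value of served traffic minus per-server cost."""
    served = min(demand, n * MU)
    return served - 2.0 * n

def features(demand, n):
    return np.array([1.0, demand / 100.0, n / MAX_SERVERS])

# Phase 1: the model-based policy controls; log (s, a, r, s', a').
log, demand = [], 40.0
for _ in range(2000):
    n = model_policy(demand)
    next_demand = max(1.0, demand + rng.normal(0.0, 5.0))  # random-walk traffic
    log.append((demand, n, reward(demand, n), next_demand,
                model_policy(next_demand)))
    demand = next_demand

# Phase 2: offline SARSA(0) updates train a one-hidden-layer MLP Q(s, a).
H = 8
W1 = rng.normal(0, 0.1, (H, 3)); b1 = np.zeros(H)
W2 = rng.normal(0, 0.1, H);      b2 = 0.0

def q(demand, n):
    h = np.tanh(W1 @ features(demand, n) + b1)
    return W2 @ h + b2, h

alpha = 1e-3
for _ in range(30):                       # epochs over the logged data
    for d, a, r, d2, a2 in log:
        qa, h = q(d, a)
        delta = r + GAMMA * q(d2, a2)[0] - qa   # TD error
        # backpropagate the TD error through the small network
        W2 = W2 + alpha * delta * h
        b2 = b2 + alpha * delta
        grad_h = delta * W2 * (1.0 - h * h)
        W1 = W1 + alpha * np.outer(grad_h, features(d, a))
        b1 = b1 + alpha * grad_h

def rl_policy(demand):
    """Greedy allocation with respect to the learned value estimates."""
    return max(range(1, MAX_SERVERS + 1), key=lambda n: q(demand, n)[0])
```

In the real system the learned policy would then take over control, and the collect-then-retrain cycle could repeat; here `rl_policy` simply reads out the greedy allocation from the trained approximator.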