Reinforcement learning is a paradigm under which an agent seeks to improve its policy by making learning updates based on the experiences it gathers through interaction with the environment. Model-free algorithms perform updates solely based on observed experiences. By contrast, model-based algorithms learn a model of the environment that effectively simulates its dynamics. The model may be used to simulate experiences or to plan into the future, potentially expediting the learning process. This paper presents a model-based reinforcement learning approach for Keepaway, a complex, continuous, stochastic, multiagent subtask of RoboCup simulated soccer. First, we propose the design of an environmental model that is partly learned based on the agent's experiences. This model is then coupled with the reinforcement learning algorithm to learn an action selection policy. We evaluate our method through empirical comparisons with model-free approaches that have previously been applied successfully to this task. Results demonstrate significant gains in both the learning speed and the asymptotic performance of our method. We also show that the learned model can be used effectively as part of a planning-based approach with a hand-coded policy.
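The idea of coupling a learned environment model with a reinforcement learning update can be illustrated with a minimal tabular Dyna-Q-style sketch. This is a hypothetical toy example, not the paper's Keepaway method (which operates in a continuous, stochastic, multiagent domain): the agent updates Q-values from real experience, records a deterministic model (state, action) → (reward, next state), and replays model samples for extra "simulated experience" updates. All names (`dyna_q`, `corridor_step`) and parameter values are illustrative assumptions.

```python
import random

def dyna_q(env_step, states, actions, start, goal,
           episodes=50, alpha=0.5, gamma=0.95, eps=0.1, planning_steps=10):
    """Tabular Dyna-Q sketch: real updates plus model-simulated updates."""
    Q = {(s, a): 0.0 for s in states for a in actions}
    model = {}  # learned deterministic model: (s, a) -> (r, s')
    rng = random.Random(0)
    for _ in range(episodes):
        s = start
        while s != goal:
            # epsilon-greedy action selection with random tie-breaking
            if rng.random() < eps:
                a = rng.choice(actions)
            else:
                best = max(Q[(s, a_)] for a_ in actions)
                a = rng.choice([a_ for a_ in actions if Q[(s, a_)] == best])
            r, s2 = env_step(s, a)  # one step of real experience
            Q[(s, a)] += alpha * (r + gamma * max(Q[(s2, b)] for b in actions)
                                  - Q[(s, a)])
            model[(s, a)] = (r, s2)  # update the learned model
            # planning: replay previously seen transitions from the model
            for _ in range(planning_steps):
                ps, pa = rng.choice(list(model))
                pr, ps2 = model[(ps, pa)]
                Q[(ps, pa)] += alpha * (pr + gamma * max(Q[(ps2, b)] for b in actions)
                                        - Q[(ps, pa)])
            s = s2
    return Q

# Tiny corridor demo: states 0..4, actions -1/+1, reward 1 on reaching state 4.
def corridor_step(s, a):
    s2 = min(max(s + a, 0), 4)
    return (1.0 if s2 == 4 else 0.0), s2

Q = dyna_q(corridor_step, states=range(5), actions=[-1, 1], start=0, goal=4)
```

The planning loop is what distinguishes the model-based agent from a plain Q-learner: each real step is amplified by several cheap simulated updates, which is the source of the learning-speed gains the abstract refers to.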