Improving action selection in MDP's via knowledge transfer

Authors:
Alexander A. Sherstov;Peter Stone
Affiliations:
Department of Computer Sciences, The University of Texas at Austin, Austin, TX;Department of Computer Sciences, The University of Texas at Austin, Austin, TX
Venue:
AAAI'05 Proceedings of the 20th national conference on Artificial intelligence - Volume 2
Year:
2005

Citing 8
Cited 10

TD-Gammon, a self-teaching backgammon program, achieves master-level play

Neural Computation
Experiments with reinforcement learning in problems with continuous state and action spaces

Adaptive Behavior
Introduction to Reinforcement Learning

Introduction to Reinforcement Learning
Scaling Reinforcement Learning toward RoboCup Soccer

ICML '01 Proceedings of the Eighteenth International Conference on Machine Learning
Hierarchical reinforcement learning with the MAXQ value function decomposition

Journal of Artificial Intelligence Research
Generalizing plans to new environments in relational MDPs

IJCAI'03 Proceedings of the 18th international joint conference on Artificial intelligence
Symbolic dynamic programming for first-order MDPs

IJCAI'01 Proceedings of the 17th international joint conference on Artificial intelligence - Volume 1
Hierarchical solution of Markov decision processes using macro-actions

UAI'98 Proceedings of the Fourteenth conference on Uncertainty in artificial intelligence

Probabilistic policy reuse in a reinforcement learning agent

AAMAS '06 Proceedings of the fifth international joint conference on Autonomous agents and multiagent systems
Epoch-Incremental Queue-Dyna Algorithm

ICAISC '08 Proceedings of the 9th international conference on Artificial Intelligence and Soft Computing
Efficient reinforcement learning with relocatable action models

AAAI'07 Proceedings of the 22nd national conference on Artificial intelligence - Volume 1
Transfer Learning for Reinforcement Learning Domains: A Survey

The Journal of Machine Learning Research
Exploring compact reinforcement-learning representations with linear regression

UAI '09 Proceedings of the Twenty-Fifth Conference on Uncertainty in Artificial Intelligence
Using spatial hints to improve policy reuse in a reinforcement learning agent

Proceedings of the 9th International Conference on Autonomous Agents and Multiagent Systems: volume 1 - Volume 1
Improving space representation in multiagent learning via tile coding

SBIA'10 Proceedings of the 20th Brazilian conference on Advances in artificial intelligence
Using advice to transfer knowledge acquired in one reinforcement learning task to another

ECML'05 Proceedings of the 16th European conference on Machine Learning
Stochastic abstract policies for knowledge transfer in robotic navigation tasks

MICAI'11 Proceedings of the 10th Mexican international conference on Advances in Artificial Intelligence - Volume Part I
Learning potential functions and their representations for multi-task reinforcement learning

Autonomous Agents and Multi-Agent Systems

Quantified Score

Hi-index	0.00

Visualization

Abstract

Temporal-difference reinforcement learning (RL) has been successfully applied in several domains with large state sets. Large action sets, however, have received considerably less attention. This paper demonstrates the use of knowledge transfer between related tasks to accelerate learning with large action sets. We introduce action transfer, a technique that extracts the actions from the (near-)optimal solution to the first task and uses them in place of the full action set when learning any subsequent tasks. When optimal actions make up a small fraction of the domain's action set, action transfer can substantially reduce the number of actions and thus the complexity of the problem. However, action transfer between dissimilar tasks can be detrimental. To address this difficulty, we contribute randomized task perturbation (RTP), an enhancement to action transfer that makes it robust to unrepresentative source tasks. We motivate RTP action transfer with a detailed theoretical analysis featuring a formalism of related tasks and a bound on the suboptimality of action transfer. The empirical results in this paper show the potential of RTP action transfer to substantially expand the applicability of RL to problems with large action sets.