We consider the reuse of policies learned on previous MDPs when learning on a new MDP, under the assumption that the parameter vector of each MDP is drawn from a fixed probability distribution. We work in the options framework, in which an option consists of a set of initiation states, a policy, and a termination condition. We introduce an option called a \emph{reuse option}, whose initiation set is the set of all states, whose policy is a combination of the policies learned on the previous MDPs, and whose termination condition depends on the number of time steps elapsed since the option was initiated. Given policies for $m$ MDPs drawn from the distribution, we construct reuse options from those policies and compare learning performance on an $(m+1)$st MDP with and without various reuse options. We find that reuse options can speed initial learning on the $(m+1)$st task. We also present a distribution of MDPs for which reuse options can slow initial learning, discuss reasons for this, and suggest other ways to design reuse options.
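To make the construction concrete, here is a minimal sketch of a reuse option in a tabular setting. The class and method names are ours, and the per-step uniform mixture over stored policies is one illustrative combination rule, not necessarily the paper's exact design; only the three structural choices (initiation set = all states, policy = combination of old policies, termination after a fixed number of steps) come from the description above.

```python
import random

class ReuseOption:
    """Sketch of a reuse option (hypothetical names, tabular setting).

    - Initiation set: all states (the option may start anywhere).
    - Policy: a combination of policies from m previous MDPs; here,
      a stored policy is sampled uniformly at random on each step
      (an illustrative choice of combination rule).
    - Termination: after max_steps time steps since initiation.
    """

    def __init__(self, old_policies, max_steps):
        self.old_policies = old_policies  # list of dicts: state -> action
        self.max_steps = max_steps
        self.steps = 0

    def can_initiate(self, state):
        # The initiation set is the set of all states.
        return True

    def initiate(self):
        # Reset the step counter used by the termination condition.
        self.steps = 0

    def act(self, state):
        # Combine old policies: sample one uniformly and follow it.
        self.steps += 1
        policy = random.choice(self.old_policies)
        return policy[state]

    def should_terminate(self):
        # Terminate based on elapsed steps since initiation.
        return self.steps >= self.max_steps
```

In use, such an option would be added to the agent's action set alongside the primitive actions when learning on the $(m+1)$st MDP, so the learner can choose between exploring with primitives and briefly following the combined old policies.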