We consider the reuse of policies learned on previous MDPs when learning on a new MDP, under the assumption that the parameter vector of each MDP is drawn from a fixed probability distribution. We work in the options framework, in which an option consists of a set of initiation states, a policy, and a termination condition. We introduce an option called a \emph{reuse option}, whose initiation set is the set of all states, whose policy is a combination of the policies learned on the previous MDPs, and whose termination condition depends on the number of time steps elapsed since the option was initiated. Given policies for $m$ MDPs drawn from the distribution, we construct reuse options from those policies and compare learning performance on an $(m+1)$st MDP with and without various reuse options. We find that reuse options can speed initial learning on the $(m+1)$st task. We also present a distribution of MDPs for which reuse options can slow initial learning, discuss reasons for this, and suggest other ways to design reuse options.
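To make the construction concrete, here is a minimal sketch of a reuse option in a tabular setting. The class and method names are ours, and the per-step uniform mixture over stored policies is one illustrative combination rule, not necessarily the paper's exact design; only the three structural choices (initiation set = all states, policy = combination of old policies, termination after a fixed number of steps) come from the description above.

```python
import random

class ReuseOption:
    """Sketch of a reuse option (hypothetical names, tabular setting).

    - Initiation set: all states (the option may start anywhere).
    - Policy: a combination of policies from m previous MDPs; here,
      a stored policy is sampled uniformly at random on each step
      (an illustrative choice of combination rule).
    - Termination: after max_steps time steps since initiation.
    """

    def __init__(self, old_policies, max_steps):
        self.old_policies = old_policies  # list of dicts: state -> action
        self.max_steps = max_steps
        self.steps = 0

    def can_initiate(self, state):
        # The initiation set is the set of all states.
        return True

    def initiate(self):
        # Reset the step counter used by the termination condition.
        self.steps = 0

    def act(self, state):
        # Combine old policies: sample one uniformly and follow it.
        self.steps += 1
        policy = random.choice(self.old_policies)
        return policy[state]

    def should_terminate(self):
        # Terminate based on elapsed steps since initiation.
        return self.steps >= self.max_steps
```

In use, such an option would be added to the agent's action set alongside the primitive actions when learning on the $(m+1)$st MDP, so the learner can choose between exploring with primitives and briefly following the combined old policies.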