Reinforcement Learning research has traditionally focused on solving single-task problems, so whenever a new task is faced, learning must restart from scratch. Recently, several studies have addressed the issue of reusing the knowledge acquired in solving previous related tasks by transferring information about policies and value functions. In this paper, we analyze the use of proto-value functions from a transfer learning perspective. Proto-value functions are effective basis functions for approximating value functions; they are defined over the graph obtained by a random walk on the environment. The definition of this graph is a key aspect in transfer problems in which both the reward function and the dynamics change. We therefore introduce policy-based proto-value functions, which are obtained by considering the graph generated by a random walk guided by the optimal policy of one of the tasks at hand. We compare the effectiveness of policy-based and standard proto-value functions on different transfer problems defined on a simple grid-world environment.
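As a minimal sketch of the standard (random-walk) construction the abstract refers to: proto-value functions are the smoothest eigenvectors of the normalized graph Laplacian of the environment graph. The grid size, neighbourhood structure, and number of basis functions below are illustrative assumptions, not details taken from the paper.

```python
import numpy as np

def grid_adjacency(n):
    """Adjacency matrix of an n x n grid world with 4-neighbour
    connectivity -- a hypothetical stand-in for the graph collected
    by a random walk on the environment."""
    N = n * n
    W = np.zeros((N, N))
    for r in range(n):
        for c in range(n):
            i = r * n + c
            for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                rr, cc = r + dr, c + dc
                if 0 <= rr < n and 0 <= cc < n:
                    W[i, rr * n + cc] = 1.0
    return W

def proto_value_functions(W, k):
    """First k eigenvectors of the normalized graph Laplacian
    L = I - D^{-1/2} W D^{-1/2}, used as basis functions for
    value-function approximation."""
    d = W.sum(axis=1)
    D_inv_sqrt = np.diag(1.0 / np.sqrt(d))
    L = np.eye(len(W)) - D_inv_sqrt @ W @ D_inv_sqrt
    # eigh returns eigenvalues in ascending order, so the first
    # columns are the smoothest functions on the graph
    _, eigvecs = np.linalg.eigh(L)
    return eigvecs[:, :k]

# One feature vector per state of a 5x5 grid, 10 basis functions
phi = proto_value_functions(grid_adjacency(5), k=10)
print(phi.shape)  # (25, 10)
```

A policy-based variant, in this framing, would replace the symmetric adjacency `W` with (a symmetrized version of) the transition graph induced by following the optimal policy of one source task, so that the resulting basis reflects that policy's dynamics rather than a uniform random walk.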