We study the problem of knowledge reuse by a reinforcement learning agent. We are interested in how an agent can exploit policies learned in the past to learn a new task more efficiently in the present. Our approach is to elicit spatial hints from an expert suggesting the world states in which each existing policy should be most relevant to the new task. By combining these hints with domain exploration, the agent is able to detect those portions of existing policies that are beneficial to the new task, and thus learns a new policy more efficiently. We call our approach Spatial Hints Policy Reuse (SHPR). Experiments demonstrate the effectiveness and robustness of our method. Our results encourage further study of how much efficiency can be gained from eliciting very simple advice from humans.
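The abstract does not give the algorithm itself, but the general idea of biasing action selection toward a past policy inside its hinted state region can be sketched as follows. This is a minimal illustrative sketch, not the paper's actual method: the names (`Hint`, `reuse_action`), the reuse probability `p_reuse`, and the fallback to uniform exploration are all assumptions.

```python
import random

class Hint:
    """Illustrative container pairing a past policy with the states
    where an expert hints it may be relevant (not from the paper)."""
    def __init__(self, states, policy):
        self.states = set(states)   # hinted region for this past policy
        self.policy = policy        # callable: state -> action

def reuse_action(state, hints, new_policy, p_reuse=0.8, actions=(-1, +1)):
    """Choose an action for `state`. With probability `p_reuse`, reuse a
    past policy whose hinted region contains the state; otherwise fall
    back to the new policy being learned, or explore uniformly if the
    new policy has no action for this state yet."""
    applicable = [h for h in hints if state in h.states]
    if applicable and random.random() < p_reuse:
        return random.choice(applicable).policy(state)
    if state in new_policy:
        return new_policy[state]
    return random.choice(actions)

# Example: one hint says "move right" is useful in states 0..4.
hints = [Hint(range(5), lambda s: +1)]
```

Under this sketch, exploration inside a hinted region is dominated by the corresponding past policy, letting the learner discover which of its action choices transfer to the new task while still exploring elsewhere.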