Policy Reuse is a reinforcement learning technique that learns a new policy efficiently by reusing similar policies learned in the past. The Policy Reuse learner improves its exploration by probabilistically exploiting those past policies. Policy Reuse was originally introduced, and its effectiveness demonstrated, on problems that share the same state and action spaces but have different reward functions. In this article, we extend Policy Reuse to transfer learning across different domains. We extend Markov Decision Processes (MDPs) to distinguish domains from tasks: domains have different state and action spaces, while tasks are problems with different rewards within a domain. We show how Policy Reuse can be applied across domains by defining and using a mapping between their state and action spaces. Using several domains, all versions of a simulated RoboCup Keepaway problem, we show that Policy Reuse can serve as a transfer learning mechanism that significantly outperforms a basic policy learner.
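The exploration idea described above can be illustrated with a minimal sketch. It is one plausible reading of the abstract, not the article's algorithm: the reuse probability `psi`, the epsilon-greedy fallback, the tabular `q_values` dictionary, and every function and variable name below are assumptions introduced for illustration. With probability `psi` the learner exploits a past policy from another domain through a state mapping and an action mapping; otherwise it explores or exploits its own value estimates.

```python
import random


def choose_action(state, actions, q_values, past_policy,
                  map_state, map_action, psi, epsilon, rng=random):
    """Select an action in the target domain (illustrative sketch).

    psi     -- assumed probability of reusing the past policy
    epsilon -- assumed exploration rate when the past policy is not reused
    """
    if rng.random() < psi:
        # Reuse: translate the target state into the source domain,
        # query the past policy, and translate its action back.
        source_state = map_state(state)
        source_action = past_policy(source_state)
        return map_action(source_action)
    if rng.random() < epsilon:
        # Explore: pick a random target-domain action.
        return rng.choice(actions)
    # Exploit: pick the action with the highest current estimate
    # (unseen state-action pairs default to 0.0).
    return max(actions, key=lambda a: q_values.get((state, a), 0.0))


# Hypothetical toy domains: the source domain has scalar states and two
# actions; the target domain has tuple states and three actions.
TARGET_ACTIONS = ["L", "R", "stay"]


def past_policy(source_state):
    # Stand-in for a policy learned earlier in the source domain.
    return "right"


def map_state(target_state):
    # Assumed state mapping: drop the extra target-domain feature.
    return target_state[0]


def map_action(source_action):
    # Assumed action mapping from source actions to target actions.
    return {"left": "L", "right": "R"}[source_action]


reused = choose_action((1, 7), TARGET_ACTIONS, {}, past_policy,
                       map_state, map_action, psi=1.0, epsilon=0.0)
greedy = choose_action((1, 7), TARGET_ACTIONS, {((1, 7), "stay"): 2.0},
                       past_policy, map_state, map_action,
                       psi=0.0, epsilon=0.0)
```

Setting `psi=1.0` always follows the mapped past policy, while `psi=0.0` reduces the learner to a plain epsilon-greedy one; in practice `psi` would presumably be reduced over time so that the new policy takes over, though the schedule is not specified here.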