Probabilistic Policy Reuse for inter-task transfer learning

Authors:
Fernando Fernández;Javier García;Manuela Veloso
Affiliations:
Universidad Carlos III de Madrid, 28911 Leganes, Madrid, Spain;Universidad Carlos III de Madrid, 28911 Leganes, Madrid, Spain;Carnegie Mellon University, Pittsburgh, United States
Venue:
Robotics and Autonomous Systems
Year:
2010

Citing 12
Cited 3

Practical Issues in Temporal Difference Learning

Machine Learning
Between MDPs and semi-MDPs: a framework for temporal abstraction in reinforcement learning

Artificial Intelligence
Behavior transfer for value-function-based reinforcement learning

Proceedings of the fourth international joint conference on Autonomous agents and multiagent systems
Probabilistic policy reuse in a reinforcement learning agent

AAMAS '06 Proceedings of the fifth international joint conference on Autonomous agents and multiagent systems
Transfer Learning via Inter-Task Mappings for Temporal Difference Learning

The Journal of Machine Learning Research
Two steps reinforcement learning

International Journal of Intelligent Systems
Inter-task action correlation for reinforcement learning tasks

AAAI'06 proceedings of the 21st national conference on Artificial intelligence - Volume 2
Using Homomorphisms to transfer options across continuous reinforcement learning domains

AAAI'06 Proceedings of the 21st national conference on Artificial intelligence - Volume 1
Value functions for RL-based behavior transfer: a comparative study

AAAI'05 Proceedings of the 20th national conference on Artificial intelligence - Volume 2
Hierarchical reinforcement learning with the MAXQ value function decomposition

Journal of Artificial Intelligence Research
Reinforcement learning: a survey

Journal of Artificial Intelligence Research
Using advice to transfer knowledge acquired in one reinforcement learning task to another

ECML'05 Proceedings of the 16th European conference on Machine Learning

Integrating reinforcement learning with human demonstrations of varying ability

The 10th International Conference on Autonomous Agents and Multiagent Systems - Volume 2
Multi-agent reinforcement learning for simulating pedestrian navigation

ALA'11 Proceedings of the 11th international conference on Adaptive and Learning Agents
Speeding-up reinforcement learning through abstraction and transfer learning

Proceedings of the 2013 international conference on Autonomous agents and multi-agent systems

Quantified Score

Hi-index	0.00

Visualization

Abstract

Policy Reuse is a reinforcement learning technique that efficiently learns a new policy by using past similar learned policies. The Policy Reuse learner improves its exploration by probabilistically including the exploitation of those past policies. Policy Reuse was introduced, and its effectiveness was previously demonstrated, in problems with different reward functions in the same state and action spaces. In this article, we contribute Policy Reuse as transfer learning among different domains. We introduce extended Markov Decision Processes (MDPs) to include domains and tasks, where domains have different state and action spaces, and tasks are problems with different rewards within a domain. We show how Policy Reuse can be applied among domains by defining and using a mapping between their state and action spaces. We use several domains, as versions of a simulated RoboCup Keepaway problem, where we show that Policy Reuse can be used as a mechanism of transfer learning significantly outperforming a basic policy learner.