We contribute Policy Reuse, a technique that improves a reinforcement learning agent by guiding it with similar policies learned in the past. Our method uses the past policies as a probabilistic bias, so that the learning agent faces three choices: exploiting the policy it is currently learning, exploring random unexplored actions, and exploiting past policies. We introduce the algorithm and its major components: an exploration strategy that incorporates the new reuse bias, and a similarity function that estimates how similar each past policy is to the new one. We provide empirical results demonstrating that Policy Reuse improves learning performance over different strategies that learn without reuse. Interestingly, and almost as a side effect, Policy Reuse also identifies classes of similar policies, revealing a basis of core policies of the domain. We demonstrate that such a basis can be built incrementally, contributing to learning the structure of a domain.
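The three-way choice described above can be illustrated with a minimal sketch. It is not the paper's algorithm as published: the function name `choose_action`, the parameters `psi` (probability of following a past policy) and `epsilon` (probability of a random action), and the dictionary-based Q-table are all assumptions introduced here for illustration.

```python
import random


def choose_action(state, q_values, past_policy, actions, psi, epsilon, rng=random):
    """Pick an action by mixing the three behaviours named in the abstract.

    All names and parameters are hypothetical:
      psi     -- probability of exploiting the past policy (the reuse bias)
      epsilon -- probability of a random exploratory action otherwise
    """
    # Choice 1: exploit a past policy with probability psi.
    if rng.random() < psi:
        return past_policy(state)
    # Choice 2: explore a random action with probability epsilon.
    if rng.random() < epsilon:
        return rng.choice(actions)
    # Choice 3: exploit the policy being learned (greedy on current Q estimates).
    return max(actions, key=lambda a: q_values.get((state, a), 0.0))


if __name__ == "__main__":
    q = {("s0", "left"): 0.2, ("s0", "right"): 0.9}
    old_policy = lambda s: "left"  # stand-in for a previously learned policy
    # With psi=0.5 the agent follows the old policy about half of the time.
    picks = [choose_action("s0", q, old_policy, ["left", "right"], 0.5, 0.1)
             for _ in range(10)]
    print(picks)
```

In a sketch like this, one would typically decay `psi` over the course of an episode so that the past policy guides early steps and the newly learned policy takes over later; whether and how to decay it is a design choice that the abstract does not specify.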