Reinforcement learning agents may learn slowly in large or complex tasks; transfer learning is one technique for speeding up learning by providing an informative prior. How best to enable transfer between tasks with different state representations and/or actions is currently an open question. This paper introduces the concept of a common task subspace, which is used to autonomously learn how two tasks are related. Experiments in two different nonlinear domains empirically show that a learned inter-state mapping can successfully be used by fitted value iteration to (1) improve the performance of a policy learned with a fixed number of samples, and (2) reduce the time required to converge to a (near-)optimal policy when samples are unlimited.
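The abstract does not spell out how the common task subspace yields an inter-state mapping, so the following Python sketch only illustrates the general idea rather than the paper's algorithm. The projection functions `phi_source`/`phi_target`, the nearest-neighbour matching, and the linear least-squares map are all assumptions made for this example; the resulting map could then be used to translate source-task experience into the target task before running fitted value iteration there.

```python
import numpy as np

# Hypothetical projections into a shared low-dimensional "common task
# subspace"; here both simply keep the first two state coordinates.
# In practice the projections would be chosen per pair of tasks.
def phi_source(s):
    return s[:2]

def phi_target(s):
    return s[:2]

def learn_inter_state_mapping(source_states, target_states):
    """Pair each target state with the source state whose subspace
    projection is closest, then fit a least-squares linear map
    chi: target state -> source state from those correspondences
    (both steps are assumptions for this sketch)."""
    proj_src = np.array([phi_source(s) for s in source_states])
    pairs = []
    for s_t in target_states:
        dists = np.linalg.norm(proj_src - phi_target(s_t), axis=1)
        pairs.append((s_t, source_states[np.argmin(dists)]))
    X = np.array([p[0] for p in pairs])        # target states
    Y = np.array([p[1] for p in pairs])        # matched source states
    A, *_ = np.linalg.lstsq(X, Y, rcond=None)  # linear map (assumption)
    return lambda s: s @ A

# Toy usage: 3-D source states and 4-D target states that happen to share
# their first two coordinates through the common subspace.
rng = np.random.default_rng(0)
src = rng.normal(size=(200, 3))
tgt = rng.normal(size=(200, 4))
chi = learn_inter_state_mapping(src, tgt)
print(chi(tgt[0]))  # a target state expressed in source-task coordinates
```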