Metalearning and neuromodulation
Neural Networks - Computational models of neuromodulation
Meta-parameters in reinforcement learning should be tuned to the environmental dynamics and to the animal's performance. Here, we propose a biologically plausible meta-reinforcement learning algorithm that tunes these meta-parameters in a dynamic, adaptive manner. We tested our algorithm in both a simulated Markov decision task and a non-linear control task. Our results show that the algorithm robustly finds appropriate meta-parameter values and controls their time course in both static and dynamic environments. We suggest that the phasic and tonic components of dopamine neuron firing can encode the signal required for meta-learning of reinforcement learning.
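The idea of adaptively tuning a meta-parameter from reward feedback can be illustrated with a small sketch. This is not a reproduction of the paper's algorithm: it is a generic perturbation-based scheme on a two-armed bandit whose reward probabilities switch midway (a dynamic environment). The softmax inverse temperature is perturbed each trial, and perturbations that raise a fast (mid-term) reward average above a slow (long-term) average — loosely analogous to phasic versus tonic reward signals — are reinforced. The task, step sizes, and noise scale are all illustrative assumptions.

```python
import math
import random

def softmax_action(q, beta, rng):
    # Boltzmann (softmax) choice between two actions with inverse temperature beta
    p0 = 1.0 / (1.0 + math.exp(-beta * (q[0] - q[1])))
    return 0 if rng.random() < p0 else 1

def run(n_trials=20000, seed=0):
    rng = random.Random(seed)
    q = [0.0, 0.0]           # action values
    alpha = 0.1              # value learning rate (held fixed; only beta is meta-learned)
    log_beta = 0.0           # meta-parameter: log inverse temperature (keeps beta > 0)
    sigma, eta = 0.2, 0.05   # perturbation scale and meta learning rate (assumed values)
    r_mid = r_long = 0.5     # fast and slow running reward averages
    for t in range(n_trials):
        # reward probabilities switch halfway through: a dynamic environment
        p_reward = (0.9, 0.1) if t < n_trials // 2 else (0.1, 0.9)
        noise = rng.gauss(0.0, sigma)        # perturb the meta-parameter this trial
        beta = math.exp(log_beta + noise)
        a = softmax_action(q, beta, rng)
        r = 1.0 if rng.random() < p_reward[a] else 0.0
        q[a] += alpha * (r - q[a])           # standard incremental value update
        r_mid += 0.05 * (r - r_mid)          # mid-term reward average (fast)
        r_long += 0.001 * (r - r_long)       # long-term reward average (slow)
        # reinforce perturbations that raised mid-term reward above the long-term baseline
        log_beta += eta * (r_mid - r_long) * noise
    return math.exp(log_beta), r_long
```

Parameterizing the meta-parameter as `log_beta` is one simple way to keep the inverse temperature positive without explicit clipping; the fast-minus-slow reward comparison supplies a baseline so that only above-average performance pushes the meta-parameter in the perturbed direction.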