We investigated the characteristics of human action selection in a Markov decision process (MDP) task and compared them with those of reinforcement-learning (RL) agents. The behavior of the human participants fell roughly into two qualitatively different types. Surprisingly, however, this variety of human behavior could be explained by a single parameter in the action-selection rule of the RL agents: the degree of randomness (i.e., the temperature parameter). This result suggests that the diverse behaviors observed in human action selection may be governed by a simple mechanism in the brain.
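
To make the role of the temperature parameter concrete, the following is a minimal sketch of softmax (Boltzmann) action selection, a common RL rule in which a temperature controls the degree of randomness. The abstract does not state which rule the agents used, so the softmax form, the function names `softmax_probs` and `select_action`, and the example action values are all assumptions made for illustration.

```python
import math
import random


def softmax_probs(q_values, temperature):
    """Return Boltzmann action probabilities for the given action values.

    Assumed rule (not confirmed by the abstract):
        P(a) = exp(Q(a) / T) / sum_b exp(Q(b) / T)
    """
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    # Subtracting the maximum leaves the probabilities unchanged
    # but avoids overflow in exp() at low temperatures.
    q_max = max(q_values)
    weights = [math.exp((q - q_max) / temperature) for q in q_values]
    total = sum(weights)
    return [w / total for w in weights]


def select_action(q_values, temperature, rng=random):
    """Sample an action index according to the softmax probabilities."""
    probs = softmax_probs(q_values, temperature)
    return rng.choices(range(len(q_values)), weights=probs, k=1)[0]


# Hypothetical action values for a single state.
q = [1.0, 2.0, 0.5]
low_t = softmax_probs(q, 0.1)     # low temperature: close to greedy
high_t = softmax_probs(q, 100.0)  # high temperature: close to uniform
```

Under this rule a low temperature concentrates almost all probability on the highest-valued action, whereas a high temperature makes the choice nearly uniform. Varying this one parameter therefore spans behavior from near-deterministic to near-random, which is the kind of single-parameter account the abstract describes.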