IEEE/ACM Transactions on Computational Biology and Bioinformatics (TCBB)
This paper presents biologically plausible computational models of brain areas involved in emotion processing and decision making. In the models, the amygdala, the orbitofrontal cortex (OFC) and the basal ganglia work together as a multi-level hierarchical reinforcement learning system: the amygdala decodes sensory cues into reward-related variables, providing an abstract, reward-related representation for decision making in the OFC, while the basal ganglia learn and execute subtask policies. We hypothesize how the amygdala may learn these representations. The models have been implemented in software to control a Khepera robot in a physical environment designed for comparison with animal behaviours. We show that representing principal emotion components in the reward function can lead to more efficient learning than standard Q-learning.
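For reference, the standard Q-learning baseline the abstract compares against can be sketched as follows. This is a generic tabular implementation on a toy chain environment, not the paper's robot controller; the function names, the optimistic initialisation, and the environment are illustrative assumptions.

```python
import random

def q_learning(n_states, n_actions, step, episodes=500,
               alpha=0.1, gamma=0.95, epsilon=0.1, q_init=1.0, seed=0):
    """Tabular Q-learning: learn Q[s][a] from an environment function.

    `step(s, a)` must return (next_state, reward, done). Q is initialised
    optimistically (q_init=1.0) to encourage systematic early exploration.
    """
    rng = random.Random(seed)
    Q = [[q_init] * n_actions for _ in range(n_states)]
    for _ in range(episodes):
        s, done = 0, False
        while not done:
            # epsilon-greedy action selection
            if rng.random() < epsilon:
                a = rng.randrange(n_actions)
            else:
                a = max(range(n_actions), key=lambda x: Q[s][x])
            s2, r, done = step(s, a)
            # one-step temporal-difference update toward r + gamma * max_a' Q(s', a')
            target = r if done else r + gamma * max(Q[s2])
            Q[s][a] += alpha * (target - Q[s][a])
            s = s2
    return Q

# Toy chain environment: 5 states; action 1 moves right, action 0 moves left;
# reward 1 for reaching the terminal state 4.
def chain_step(s, a):
    s2 = min(s + 1, 4) if a == 1 else max(s - 1, 0)
    return s2, (1.0 if s2 == 4 else 0.0), s2 == 4

Q = q_learning(5, 2, chain_step)
```

Hierarchical schemes such as the one described above replace this flat value table with subtask-level policies, which is where the claimed efficiency gain over flat Q-learning arises.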