Technical Note: Q-Learning
Machine Learning
The MAXQ Method for Hierarchical Reinforcement Learning
ICML '98 Proceedings of the Fifteenth International Conference on Machine Learning
Near-Optimal Reinforcement Learning in Polynomial Time
ICML '98 Proceedings of the Fifteenth International Conference on Machine Learning
Hierarchical reinforcement learning with the MAXQ value function decomposition
Journal of Artificial Intelligence Research
We discuss reinforcement learning from an intertemporal choice perspective. Unlike previous research, this paper emphasizes the importance of a deeper understanding of the psychological mechanisms behind human decision-making. We aim to improve the standard Q-learning algorithm in light of new results from intertemporal choice experiments. We start with a brief introduction to recent findings in intertemporal choice theory and to reinforcement learning. We then propose a new reinforcement learning algorithm with selective discount (SD-Q). Experiments show that SD-Q outperforms both the traditional Q-learning algorithm and reinforcement learning without discounting.
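For orientation, the two baselines named in the abstract can be sketched with the standard tabular Q-learning update, where the discount factor `gamma` is the only difference between them. This is a minimal illustration, not the paper's method: the abstract does not specify the SD-Q update rule, so it is not reproduced here. The function name `q_update`, the state and action labels, and all numeric values are assumptions made for the example.

```python
from collections import defaultdict


def q_update(Q, state, action, reward, next_state, actions, alpha=0.1, gamma=0.9):
    """Apply one standard tabular Q-learning step and return the new Q-value.

    gamma < 1 gives the traditional discounted update; gamma = 1 gives the
    undiscounted variant. Both are baselines, not SD-Q.
    """
    # Greedy estimate of the value of the next state.
    best_next = max(Q[(next_state, a)] for a in actions)
    # Temporal-difference target and update.
    target = reward + gamma * best_next
    Q[(state, action)] += alpha * (target - Q[(state, action)])
    return Q[(state, action)]


# Hypothetical two-action problem, used only to compare the two baselines.
actions = ["left", "right"]
Q_disc = defaultdict(float)    # traditional Q-learning (discounted)
Q_undisc = defaultdict(float)  # no discounting
for Q in (Q_disc, Q_undisc):
    Q[("s1", "right")] = 10.0  # assumed existing estimate for the next state

v_disc = q_update(Q_disc, "s0", "right", 1.0, "s1", actions, alpha=0.5, gamma=0.9)
v_undisc = q_update(Q_undisc, "s0", "right", 1.0, "s1", actions, alpha=0.5, gamma=1.0)
print(v_disc, v_undisc)
```

With the same transition, the discounted learner weights the future estimate less than the undiscounted one, so the two tables diverge after a single step. A selective scheme would presumably choose how to discount per situation, but how SD-Q does so is not stated in the abstract.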