SD-Q: selective discount Q learning based on new results of intertemporal choice theory

Authors:
Fengfei Zhao;Zheng Qin
Affiliations:
Department of Computer Science and Technology, Tsinghua University, Beijing, China;Department of Computer Science and Technology, Tsinghua University, Beijing, China
Venue:
AICI'11 Proceedings of the Third international conference on Artificial intelligence and computational intelligence - Volume Part II
Year:
2011

Abstract

We discuss reinforcement learning from an intertemporal choice perspective. Unlike previous research, this paper emphasizes the importance of a deeper understanding of the psychological mechanisms of human decision-making. We aim to improve the standard Q-learning algorithm in light of new results from intertemporal choice experiments. We start with a brief introduction to new findings in intertemporal choice theory and to reinforcement learning. We then propose a new reinforcement learning algorithm with selective discount (SD-Q). Experiments show that SD-Q is superior to both the traditional Q-learning algorithm and a reinforcement learning method that does not apply discounting.
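
The abstract does not state the SD-Q update rule, so the sketch below covers only the two baselines it names: the traditional tabular Q-learning update with a discount factor, and the same update without discounting (a discount factor of 1). The function name, state and action labels, and parameter values are illustrative assumptions, not taken from the paper.

```python
from collections import defaultdict


def q_update(q, state, action, reward, next_state, actions, alpha, gamma):
    """Apply one tabular Q-learning step and return the updated value.

    gamma < 1 gives the traditional discounted update;
    gamma = 1 gives the undiscounted variant.
    """
    # Greedy estimate of the value of the next state.
    best_next = max(q[(next_state, a)] for a in actions)
    target = reward + gamma * best_next
    # Move the current estimate a fraction alpha toward the target.
    q[(state, action)] += alpha * (target - q[(state, action)])
    return q[(state, action)]


def demo(gamma):
    """Run one update on a toy table (hypothetical states and actions)."""
    actions = ["left", "right"]
    q = defaultdict(float)        # unseen state-action pairs start at 0
    q[("s1", "right")] = 2.0      # assumed prior estimate for the next state
    return q_update(q, "s0", "right", 1.0, "s1", actions, alpha=0.5, gamma=gamma)


discounted = demo(gamma=0.9)      # traditional Q-learning
undiscounted = demo(gamma=1.0)    # no discounting
print(discounted, undiscounted)
```

With these toy numbers the discounted update weights the future estimate less than the undiscounted one, which is the only difference between the two baselines. How SD-Q selects when to discount is described in the paper itself and is not reproduced here.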