This paper describes compound reinforcement learning (RL), an extension of RL based on the compound return. Compound RL maximizes the logarithm of the expected double-exponentially discounted compound return in return-based Markov decision processes (MDPs). The contributions of this paper are (1) a theoretical description of compound RL, an extended RL framework for maximizing the compound return in a return-based MDP, and (2) experimental results on an illustrative example and an application to finance.
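To make the distinction between a compound return and the usual additive return concrete, the following is a minimal sketch, not the paper's implementation. It assumes that "double-exponentially discounted" means each gross return factor (1 + r_k) is raised to the power gamma**k, so that the logarithm of the compound return becomes a discounted sum of log(1 + r_k). The function names and the sample return sequence are hypothetical and chosen only for illustration.

```python
import math


def compound_return(returns, gamma=1.0):
    """Product of gross returns (1 + r_k), each raised to gamma**k.

    Assumed form of the double-exponentially discounted compound return;
    the abstract does not spell out the definition.
    """
    total = 1.0
    for k, r in enumerate(returns):
        total *= (1.0 + r) ** (gamma ** k)
    return total


def log_compound_return(returns, gamma=1.0):
    """Logarithm of the compound return: a discounted sum of log(1 + r_k)."""
    return sum((gamma ** k) * math.log1p(r) for k, r in enumerate(returns))


def additive_return(returns, gamma=1.0):
    """Conventional discounted sum of per-step returns, for comparison."""
    return sum((gamma ** k) * r for k, r in enumerate(returns))


# Hypothetical per-step rates of return: +50% followed by -50%.
sample = [0.5, -0.5]
print(additive_return(sample))      # the additive view sees no net change
print(compound_return(sample))      # the compound view sees a net loss
print(log_compound_return(sample))  # negative, since the product is below 1
```

Under these assumptions, a gain of 50% followed by a loss of 50% sums to zero additively but multiplies to 0.75, which is the kind of difference that motivates optimizing the compound return in settings such as finance.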