The task of optimizing asset allocation under transaction costs can be formulated within the framework of Markov decision processes (MDPs) and reinforcement learning. This paper proposes a risk-averse reinforcement learning algorithm that improves the asset allocation strategy of portfolio management systems. The algorithm alternates between policy evaluation phases, which take into account the mean and variance of the return under a given policy, and policy improvement phases, which follow the variance-penalized criterion. The algorithm is tested on trading systems for a single futures contract corresponding to a Japanese stock index.
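The alternation described above can be illustrated with a minimal tabular sketch. It is not the paper's algorithm, whose details the abstract does not give. It assumes a small MDP with known transition probabilities, a discounted return G, and a variance-penalized score of the form E[G] - kappa * Var[G]. The names `evaluate`, `improve`, `variance_penalized_policy_iteration`, `EXAMPLE_MDP`, and the parameters `gamma` and `kappa` are all hypothetical. Greedy improvement on this score is a heuristic and is not guaranteed to optimize the criterion globally, so the loop is capped at a fixed number of rounds.

```python
from typing import List, Tuple

# mdp[state][action] is a list of (probability, next_state, reward) outcomes.
Outcome = Tuple[float, int, float]
MDP = List[List[List[Outcome]]]


def action_stats(mdp: MDP, s: int, a: int, v: List[float], m: List[float],
                 gamma: float) -> Tuple[float, float]:
    """First and second moment of the return for taking action a in state s,
    given the mean v and second moment m of the return from successor states."""
    q, q2 = 0.0, 0.0
    for p, s2, r in mdp[s][a]:
        q += p * (r + gamma * v[s2])
        # E[(r + gamma*G')^2] = r^2 + 2*gamma*r*E[G'] + gamma^2*E[G'^2]
        q2 += p * (r * r + 2.0 * gamma * r * v[s2] + gamma * gamma * m[s2])
    return q, q2


def evaluate(mdp: MDP, policy: List[int], gamma: float,
             sweeps: int = 500) -> Tuple[List[float], List[float]]:
    """Policy evaluation: iterate the moment recursions for a fixed policy."""
    n = len(mdp)
    v, m = [0.0] * n, [0.0] * n
    for _ in range(sweeps):
        stats = [action_stats(mdp, s, policy[s], v, m, gamma) for s in range(n)]
        v = [q for q, _ in stats]
        m = [q2 for _, q2 in stats]
    return v, m


def improve(mdp: MDP, v: List[float], m: List[float], gamma: float,
            kappa: float) -> List[int]:
    """Policy improvement: pick the action maximizing mean - kappa * variance."""
    policy = []
    for s in range(len(mdp)):
        def score(a: int, s: int = s) -> float:
            q, q2 = action_stats(mdp, s, a, v, m, gamma)
            return q - kappa * (q2 - q * q)
        policy.append(max(range(len(mdp[s])), key=score))
    return policy


def variance_penalized_policy_iteration(mdp: MDP, gamma: float, kappa: float,
                                        max_rounds: int = 50) -> List[int]:
    """Alternate evaluation and improvement until the policy stops changing."""
    policy = [0] * len(mdp)
    for _ in range(max_rounds):
        v, m = evaluate(mdp, policy, gamma)
        new_policy = improve(mdp, v, m, gamma, kappa)
        if new_policy == policy:
            break
        policy = new_policy
    return policy


# Hypothetical toy problem: state 0 offers a safe action (reward 1.0) and a
# risky action (reward 3.0 or -0.8 with equal probability); state 1 is absorbing.
EXAMPLE_MDP: MDP = [
    [[(1.0, 1, 1.0)], [(0.5, 1, 3.0), (0.5, 1, -0.8)]],
    [[(1.0, 1, 0.0)]],
]
```

In this toy problem the risky action has the higher mean return (1.1 versus 1.0) but a variance of 3.61. Calling `variance_penalized_policy_iteration(EXAMPLE_MDP, gamma=0.9, kappa=0.0)` therefore selects the risky action in state 0, while `kappa=1.0` selects the safe one. A trading application would replace the known transition table with estimates learned from market data.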