The K best-paths approach to approximate dynamic programming with application to portfolio optimization

Authors:
Nicolas Chapados;Yoshua Bengio
Affiliations:
Dept. IRO, Université de Montréal, Montréal, Québec, Canada;Dept. IRO, Université de Montréal, Montréal, Québec, Canada
Venue:
AI'06 Proceedings of the 19th international conference on Advances in Artificial Intelligence: Canadian Society for Computational Studies of Intelligence
Year:
2006

Abstract

We describe a general method for transforming a non-Markovian sequential decision problem into a supervised learning problem using a K-best-paths algorithm. We consider an application in financial portfolio management, where a controller can be trained to directly optimize the Sharpe ratio (or another risk-averse, non-additive utility function). We illustrate the approach with experimental results obtained using a kernel-based controller architecture that would not normally be considered in traditional reinforcement learning or approximate dynamic programming.
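
To make "non-additive" concrete, the following minimal Python sketch computes a Sharpe ratio in its common textbook form (mean excess return divided by the sample standard deviation). The abstract does not give the authors' exact definition, so this form is an assumption, and the function name `sharpe_ratio` and the return figures are purely illustrative. The point of the sketch is that the utility of a whole return path is not the sum of the utilities of its segments, which is why a standard additive-reward decomposition does not apply directly.

```python
import math


def sharpe_ratio(returns, risk_free=0.0):
    """Mean excess return divided by the sample standard deviation.

    Assumed textbook form; not taken from the paper itself.
    """
    excess = [r - risk_free for r in returns]
    n = len(excess)
    if n < 2:
        raise ValueError("need at least two returns")
    mean = sum(excess) / n
    var = sum((x - mean) ** 2 for x in excess) / (n - 1)
    if var == 0.0:
        raise ValueError("zero variance: Sharpe ratio undefined")
    return mean / math.sqrt(var)


# Illustrative per-period returns (made-up numbers).
first_half = [0.02, 0.01, 0.03]
second_half = [-0.01, 0.04, 0.00]
whole_path = first_half + second_half

utility_whole = sharpe_ratio(whole_path)
utility_sum_of_parts = sharpe_ratio(first_half) + sharpe_ratio(second_half)

# The two values differ: the utility depends on the entire path,
# not on a sum of independent per-segment rewards.
print(utility_whole, utility_sum_of_parts)
```

Because the score of a trajectory can only be evaluated on the trajectory as a whole, searching over complete candidate paths (as a K-best-paths procedure does) is a natural fit for this kind of objective.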