We propose, for risk-sensitive control of finite Markov chains, a counterpart of the popular Q-learning algorithm for classical Markov decision processes. The algorithm is shown to converge with probability one to the desired solution. The proof technique is an adaptation of the o.d.e. approach for the analysis of stochastic approximation algorithms, with most of the effort devoted to analyzing the specific o.d.e.s that arise.
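To make the idea concrete, below is a minimal, hedged sketch of what a risk-sensitive (multiplicative, exponential-cost) counterpart of tabular Q-learning can look like on a toy controlled Markov chain. The transition model, costs, exploration scheme, step-size schedule, and the normalization by a reference state-action pair are all illustrative assumptions for this sketch, not details taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy controlled Markov chain (illustrative assumption): 2 states, 2 actions.
# P[a, s, s'] = transition probability under action a; c[s, a] = one-step cost.
P = np.array([[[0.9, 0.1], [0.2, 0.8]],
              [[0.5, 0.5], [0.7, 0.3]]])
c = np.array([[1.0, 0.5], [0.2, 2.0]])

Q = np.ones((2, 2))   # multiplicative Q-factors, kept strictly positive
ref = (0, 0)          # reference state-action pair used for normalization

s = 0
for n in range(1, 200_001):
    a = int(rng.integers(2))                 # uniform exploration (assumption)
    s_next = int(rng.choice(2, p=P[a, s]))
    alpha = 1.0 / (1.0 + n / 100.0)          # tapering step size (assumption)
    # Multiplicative update: exponentiated cost times the best continuation,
    # normalized by the reference entry so the iterates stay bounded.
    target = np.exp(c[s, a]) * Q[s_next].min() / Q[ref]
    Q[s, a] += alpha * (target - Q[s, a])
    s = s_next

# log Q[ref] then serves as an estimate of the optimal risk-sensitive cost rate.
print(np.log(Q[ref]))
```

The normalization by a fixed reference pair plays the role of the unknown multiplicative constant in the underlying multiplicative Poisson equation; it is one common device and is used here purely for illustration.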