Uncertainty Propagation for Efficient Exploration in Reinforcement Learning

  • Authors:
  • Alexander Hans; Steffen Udluft

  • Affiliations:
  • Alexander Hans: Ilmenau University of Technology, Neuroinformatics & Cognitive Robotics Lab, P.O. Box 100565, D-98684 Ilmenau, Germany, email: alexander.hans.ext@siemens.com, and Siemens AG, Corporate Research and ...
  • Steffen Udluft: Siemens AG, Corporate Research and Technologies, Otto-Hahn-Ring 6, D-81739 Munich, Germany, email: steffen.udluft@siemens.com

  • Venue:
  • Proceedings of the 19th European Conference on Artificial Intelligence (ECAI 2010)
  • Year:
  • 2010

Abstract

Reinforcement learning aims to derive an optimal policy for an often initially unknown environment. In the case of an unknown environment, exploration is used to acquire knowledge about it. In this context the well-known exploration-exploitation dilemma arises: when should one stop exploring and instead exploit the knowledge already gathered? In this paper we propose an uncertainty-based exploration method. We use uncertainty propagation to obtain the Q-function's uncertainty and then use this uncertainty in combination with the Q-values to guide exploration toward promising states that have so far been insufficiently explored. The weight given to the uncertainty during action selection is controlled by a parameter. We evaluate one variant of the algorithm using full covariance matrices and two variants using an approximation, and demonstrate their functionality on two benchmark problems.
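
To make the idea concrete, the sketch below shows the general kind of scheme the abstract describes: Q-iteration on an estimated MDP with Gaussian uncertainty propagation restricted to variances (the approximation mentioned above, ignoring full covariance matrices), and action selection that maximizes Q plus a parameter-weighted uncertainty term. This is a minimal illustration, not the paper's exact formulation: the function name up_q_iteration, the parameter name xi, and the assumption that estimation variances var_P and var_R of the transition model are supplied by some external estimator are all illustrative choices.

```python
import numpy as np

def up_q_iteration(P, R, var_P, var_R, gamma=0.95, xi=1.0, n_iter=500):
    """Q-iteration with variance-only uncertainty propagation (illustrative sketch).

    P, R         -- estimated transition probabilities and rewards, shape (S, A, S)
    var_P, var_R -- assumed estimation variances of P and R, same shape
    xi           -- weight of the uncertainty during action selection
    """
    S, A, _ = P.shape
    Q = np.zeros((S, A))
    var_Q = np.zeros((S, A))
    for _ in range(n_iter):
        # Uncertainty-aware greedy policy: favor promising but uncertain actions.
        pi = np.argmax(Q + xi * np.sqrt(var_Q), axis=1)
        V = Q[np.arange(S), pi]
        var_V = var_Q[np.arange(S), pi]

        # Standard Bellman update of the Q-values on the estimated model.
        target = R + gamma * V[None, None, :]            # shape (S, A, S)
        Q_new = np.einsum('sat,sat->sa', P, target)

        # Gaussian uncertainty propagation, keeping only variances
        # (covariances between estimates are ignored in this approximation).
        var_new = (np.einsum('sat,sat->sa', target ** 2, var_P)          # dQ/dP = target
                   + np.einsum('sat,sat->sa', P ** 2, var_R)             # dQ/dR = P
                   + gamma ** 2 * np.einsum('sat,t->sa', P ** 2, var_V)) # dQ/dV' = gamma * P
        Q, var_Q = Q_new, var_new

    pi = np.argmax(Q + xi * np.sqrt(var_Q), axis=1)
    return Q, np.sqrt(var_Q), pi
```

With xi greater than zero the resulting policy is optimistic toward uncertain actions and therefore explores insufficiently visited states; keeping a full covariance matrix over all (state, action) entries instead of the variance vector roughly corresponds to the full-covariance variant mentioned in the abstract, at considerably higher computational cost.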