Efficient Uncertainty Propagation for Reinforcement Learning with Limited Data
ICANN '09 Proceedings of the 19th International Conference on Artificial Neural Networks: Part I
Reinforcement learning aims to derive an optimal policy for an environment that is often initially unknown. In the case of an unknown environment, exploration is used to acquire knowledge about it. In that context the well-known exploration-exploitation dilemma arises---when should one stop exploring and instead exploit the knowledge already gathered? In this paper we propose an uncertainty-based exploration method. We use uncertainty propagation to obtain the Q-function's uncertainty and then combine this uncertainty with the Q-values to guide exploration toward promising states that have so far been insufficiently explored. A parameter controls the weight given to the uncertainty during action selection. We evaluate one variant of the algorithm using full covariance matrices and two variants using an approximation, and demonstrate their functionality on two benchmark problems.
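The abstract describes selecting actions from a combination of Q-values and their uncertainty, weighted by a parameter. The sketch below illustrates one plausible form of such a rule; the exact combination (additive here, with weight `xi`) and the function name are assumptions for illustration, not the paper's definitive method.

```python
import numpy as np

def select_action(q_values, q_uncertainty, xi):
    """Pick the action maximizing Q(s, a) + xi * sigma_Q(s, a).

    Assumed illustrative rule: xi > 0 favors uncertain
    (insufficiently explored) actions; xi = 0 reduces to
    greedy exploitation of the Q-values alone.
    """
    scores = q_values + xi * q_uncertainty
    return int(np.argmax(scores))

# Example: action 1 has the highest Q-value, but action 2 is far
# more uncertain, so a sufficiently large xi selects it instead.
q = np.array([1.0, 2.0, 1.5])
sigma = np.array([0.1, 0.1, 1.0])
print(select_action(q, sigma, xi=0.0))  # -> 1 (greedy)
print(select_action(q, sigma, xi=1.0))  # -> 2 (uncertainty-driven)
```

Varying `xi` trades off exploitation against exploration: small values stay close to the greedy policy, larger values steer the agent toward actions whose value estimates are still uncertain.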