Uncertainty Propagation for Efficient Exploration in Reinforcement Learning

  • Authors:
  • Alexander Hans; Steffen Udluft

  • Affiliations:
  • Alexander Hans: Ilmenau University of Technology, Neuroinformatics & Cognitive Robotics Lab, P.O. Box 100565, D-98684 Ilmenau, Germany, email: alexander.hans.ext@siemens.com, and Siemens AG, Corporate Research and ...
  • Steffen Udluft: Siemens AG, Corporate Research and Technologies, Otto-Hahn-Ring 6, D-81739 Munich, Germany, email: steffen.udluft@siemens.com

  • Venue:
  • Proceedings of the 19th European Conference on Artificial Intelligence (ECAI 2010)
  • Year:
  • 2010

Abstract

Reinforcement learning aims to derive an optimal policy for an often initially unknown environment. In the case of an unknown environment, exploration is used to acquire knowledge about it. In this context the well-known exploration-exploitation dilemma arises: when should one stop exploring and instead exploit the knowledge already gathered? In this paper we propose an uncertainty-based exploration method. We use uncertainty propagation to obtain the Q-function's uncertainty and then use this uncertainty in combination with the Q-values to guide exploration toward promising states that have so far been insufficiently explored. The weight given to the uncertainty during action selection is controlled by a parameter. We evaluate one variant of the algorithm using full covariance matrices and two variants using an approximation, and demonstrate their functionality on two benchmark problems.
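
To make the idea concrete, the sketch below shows the general kind of scheme the abstract describes: Q-iteration on an estimated MDP with Gaussian uncertainty propagation restricted to variances (the approximation mentioned above, ignoring full covariance matrices), and action selection that maximizes Q plus a parameter-weighted uncertainty term. This is a minimal illustration, not the paper's exact formulation: the function name up_q_iteration, the parameter name xi, and the assumption that estimation variances var_P and var_R of the transition model are supplied by some external estimator are all illustrative choices.

```python
import numpy as np

def up_q_iteration(P, R, var_P, var_R, gamma=0.95, xi=1.0, n_iter=500):
    """Q-iteration with variance-only uncertainty propagation (illustrative sketch).

    P, R         -- estimated transition probabilities and rewards, shape (S, A, S)
    var_P, var_R -- assumed estimation variances of P and R, same shape
    xi           -- weight of the uncertainty during action selection
    """
    S, A, _ = P.shape
    Q = np.zeros((S, A))
    var_Q = np.zeros((S, A))
    for _ in range(n_iter):
        # Uncertainty-aware greedy policy: favor promising but uncertain actions.
        pi = np.argmax(Q + xi * np.sqrt(var_Q), axis=1)
        V = Q[np.arange(S), pi]
        var_V = var_Q[np.arange(S), pi]

        # Standard Bellman update of the Q-values on the estimated model.
        target = R + gamma * V[None, None, :]            # shape (S, A, S)
        Q_new = np.einsum('sat,sat->sa', P, target)

        # Gaussian uncertainty propagation, keeping only variances
        # (covariances between estimates are ignored in this approximation).
        var_new = (np.einsum('sat,sat->sa', target ** 2, var_P)          # dQ/dP = target
                   + np.einsum('sat,sat->sa', P ** 2, var_R)             # dQ/dR = P
                   + gamma ** 2 * np.einsum('sat,t->sa', P ** 2, var_V)) # dQ/dV' = gamma * P
        Q, var_Q = Q_new, var_new

    pi = np.argmax(Q + xi * np.sqrt(var_Q), axis=1)
    return Q, np.sqrt(var_Q), pi
```

With xi greater than zero the resulting policy is optimistic toward uncertain actions and therefore explores insufficiently visited states; keeping a full covariance matrix over all (state, action) entries instead of the variance vector roughly corresponds to the full-covariance variant mentioned in the abstract, at considerably higher computational cost.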