Adaptive exploration using stochastic neurons

  • Authors:
  • Michel Tokic; Günther Palm

  • Affiliations:
  • Institute of Neural Information Processing, University of Ulm, Germany, and Institute of Applied Research, University of Applied Sciences Ravensburg-Weingarten, Germany; Institute of Neural Information Processing, University of Ulm, Germany

  • Venue:
  • ICANN'12 Proceedings of the 22nd international conference on Artificial Neural Networks and Machine Learning - Volume Part II
  • Year:
  • 2012

Abstract

Stochastic neurons are used to adapt exploration parameters efficiently by means of gradient-following algorithms. The approach is evaluated in model-free temporal-difference learning with discrete actions. Its main advantage is memory efficiency, because exploratory data needs to be memorized only for starting states. Hence, if a learning problem consists of only one starting state, the exploratory data can be treated as global. Results suggest that the presented approach can be combined efficiently with standard off-policy and on-policy algorithms such as Q-learning and Sarsa.
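
The idea summarized in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes a Gaussian stochastic neuron whose output is used as the epsilon of an epsilon-greedy policy, a REINFORCE-style gradient-following update of the neuron's mean based on the discounted episode return, and tabular Q-learning on a small chain task. The environment, the function name run, the dictionary mu_by_start, and all hyperparameter values are hypothetical choices made for illustration only.

```python
import random


def run(episodes=300, seed=0):
    """Q-learning on a 5-state chain with a neuron-controlled epsilon (sketch)."""
    rng = random.Random(seed)
    n_states, goal, start = 5, 4, 0          # hypothetical chain task
    gamma, alpha_q = 0.9, 0.5                # assumed discount and Q step size
    q = [[0.0, 0.0] for _ in range(n_states)]  # actions: 0 = left, 1 = right

    # Exploratory data is kept per starting state only (here: a single state).
    mu_by_start = {start: 0.5}               # mean of the Gaussian neuron
    sigma, alpha_mu = 0.2, 0.01              # assumed fixed std. dev. and step size
    baseline = 0.0                           # running average of returns

    for _ in range(episodes):
        mu = mu_by_start[start]
        raw = rng.gauss(mu, sigma)           # stochastic neuron output
        eps = min(1.0, max(0.0, raw))        # clipped to a valid probability
        s, ret, discount = start, 0.0, 1.0

        for _ in range(50):                  # step limit per episode
            if rng.random() < eps:
                a = rng.randrange(2)         # exploratory action
            else:
                best = max(q[s])             # greedy action, ties broken randomly
                a = rng.choice([i for i in (0, 1) if q[s][i] == best])
            s2 = max(0, s - 1) if a == 0 else s + 1
            done = s2 == goal
            r = 1.0 if done else 0.0
            # Standard Q-learning (off-policy) update.
            target = r + (0.0 if done else gamma * max(q[s2]))
            q[s][a] += alpha_q * (target - q[s][a])
            ret += discount * r
            discount *= gamma
            s = s2
            if done:
                break

        # REINFORCE-style update: (return - baseline) times the gradient of
        # log N(raw; mu, sigma) with respect to mu.
        grad = (raw - mu) / sigma ** 2
        mu += alpha_mu * (ret - baseline) * grad
        mu_by_start[start] = min(1.0, max(0.0, mu))
        baseline += 0.1 * (ret - baseline)

    return mu_by_start, q
```

Calling run() returns the adapted neuron mean for the single starting state together with the learned Q-table. Because the task has only one starting state, the single entry in mu_by_start acts as a global exploration parameter, which mirrors the memory argument made in the abstract. Replacing the Q-learning target with the value of the action actually taken next would give a Sarsa variant of the same sketch.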