Uncertainty handling CMA-ES for reinforcement learning

  • Authors:
  • Verena Heidrich-Meisner;Christian Igel

  • Affiliations:
  • Ruhr-Universität, Bochum, Germany;Ruhr-Universität, Bochum, Germany

  • Venue:
  • Proceedings of the 11th Annual conference on Genetic and evolutionary computation
  • Year:
  • 2009

Abstract

The covariance matrix adaptation evolution strategy (CMA-ES) has proven to be a powerful method for reinforcement learning (RL). Recently, the CMA-ES has been augmented with an adaptive uncertainty handling mechanism. Because uncertainty is a typical property of RL problems, this new algorithm, termed UH-CMA-ES, is promising for RL. The UH-CMA-ES dynamically adjusts the number of episodes considered in each evaluation of a policy. It controls the signal-to-noise ratio such that it is just high enough for a sufficiently good ranking of candidate policies, which in turn allows the evolutionary learning to find better solutions. This significantly increases the learning speed as well as the robustness without impairing the quality of the final solutions. We evaluate the UH-CMA-ES on fully and partially observable Markov decision processes with random start states and noisy observations. A canonical natural policy gradient method and random search serve as baselines for comparison.
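
The idea of adjusting the number of episodes per evaluation can be illustrated with a minimal Python sketch. It is not the UH-CMA-ES update rule from the paper: it is a simplified stand-in that evaluates every candidate policy twice, measures how much the ranking changes between the two passes, and raises or lowers the episode count accordingly. The function names, the `run_episode(policy, rng)` callback, and the parameters `threshold`, `alpha`, `n_min`, and `n_max` are all hypothetical choices made for illustration.

```python
import math
import random


def average_return(policy, run_episode, n_episodes, rng):
    """Mean return of `policy` over `n_episodes` (possibly noisy) episodes."""
    total = sum(run_episode(policy, rng) for _ in range(n_episodes))
    return total / n_episodes


def ranks(values):
    """Rank of each entry in `values` (0 = smallest); ties keep index order."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for rank, index in enumerate(order):
        result[index] = rank
    return result


def adapt_episode_count(policies, run_episode, n_episodes, rng,
                        threshold=0.2, alpha=1.5, n_min=1, n_max=100):
    """Return (new episode count, normalized mean rank change).

    Assumption: a simplified rule, not the one used by UH-CMA-ES.
    Each policy is evaluated twice; if the two rankings disagree by more
    than `threshold`, the noise is judged too high and the episode count
    grows by the factor `alpha`, otherwise it shrinks by the same factor.
    """
    first = [average_return(p, run_episode, n_episodes, rng) for p in policies]
    second = [average_return(p, run_episode, n_episodes, rng) for p in policies]
    r1, r2 = ranks(first), ranks(second)
    mean_change = sum(abs(a - b) for a, b in zip(r1, r2)) / len(policies)
    normalized = mean_change / max(1, len(policies) - 1)
    if normalized > threshold:
        n_new = min(n_max, math.ceil(n_episodes * alpha))
    else:
        n_new = max(n_min, math.floor(n_episodes / alpha))
    return n_new, normalized


if __name__ == "__main__":
    rng = random.Random(0)
    # Hypothetical toy task: the "policy" is a number, its return is that
    # number plus Gaussian noise.
    noisy = lambda policy, r: policy + r.gauss(0.0, 5.0)
    print(adapt_episode_count([0.0, 0.5, 1.0, 1.5], noisy, 4, rng))
```

In an evolutionary loop, a rule of this kind would be called once per generation, so that evaluations stay cheap while the ranking is stable and become more careful only when noise starts to scramble the order of the candidates.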