Evolution Strategies for Direct Policy Search

Authors:
Verena Heidrich-Meisner; Christian Igel
Affiliations:
Institut für Neuroinformatik, Ruhr-Universität Bochum, Germany (both authors)
Venue:
Proceedings of the 10th international conference on Parallel Problem Solving from Nature: PPSN X
Year:
2008

Abstract

The covariance matrix adaptation evolution strategy (CMA-ES) is suggested for solving problems described by Markov decision processes. The algorithm is compared with a state-of-the-art policy gradient method and stochastic search on the double cart-pole balancing task using linear policies. The CMA-ES proves to be much more robust than the gradient-based approach in this scenario.
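
To make the idea of direct policy search with a linear policy concrete, the following is a minimal sketch and not the authors' method. It uses a simplified evolution strategy with isotropic Gaussian mutations and a fixed step-size decay instead of CMA-ES, and a hypothetical toy point-mass task instead of the double cart-pole benchmark. The names `episode_return` and `evolve_policy`, the cost function, and all parameter values are assumptions chosen for illustration.

```python
import numpy as np

def episode_return(weights, steps=200, dt=0.05):
    """Return of the linear policy u = w0 * position + w1 * velocity.

    The point mass is a hypothetical stand-in task (an assumption),
    NOT the double cart-pole benchmark named in the abstract.
    """
    position, velocity = 1.0, 0.0  # start displaced from the origin
    total = 0.0
    for _ in range(steps):
        force = weights[0] * position + weights[1] * velocity
        force = max(-10.0, min(10.0, float(force)))  # bounded actuator
        velocity += dt * force
        position += dt * velocity
        total -= dt * (position ** 2 + 0.01 * force ** 2)  # quadratic cost
    return total

def evolve_policy(generations=40, offspring=20, parents=5, sigma=0.5, seed=0):
    """Simplified evolution strategy with isotropic mutations (not CMA-ES)."""
    rng = np.random.default_rng(seed)
    mean = np.zeros(2)  # policy weights; the search starts from u = 0
    for _ in range(generations):
        # Sample offspring around the current mean of the search distribution.
        candidates = mean + sigma * rng.standard_normal((offspring, 2))
        returns = np.array([episode_return(c) for c in candidates])
        # Keep the best candidates and recombine them by averaging.
        best = candidates[np.argsort(returns)[-parents:]]
        mean = best.mean(axis=0)
        sigma *= 0.95  # fixed decay; CMA-ES would adapt step size and covariance
    return mean

if __name__ == "__main__":
    learned = evolve_policy()
    print("return of zero policy:   ", episode_return(np.zeros(2)))
    print("return of evolved policy:", episode_return(learned))
```

The search only ranks candidates by their episode returns, so it needs no gradient of the return with respect to the policy weights. A full CMA-ES would additionally adapt the covariance matrix and the global step size of the search distribution.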