The covariance matrix adaptation evolution strategy (CMA-ES) has proven to be a powerful method for reinforcement learning (RL). Recently, the CMA-ES has been augmented with an adaptive uncertainty handling mechanism. Because uncertainty is a typical property of RL problems, this new algorithm, termed UH-CMA-ES, is promising for RL. The UH-CMA-ES dynamically adjusts the number of episodes considered in each evaluation of a policy. It controls the signal-to-noise ratio such that it is just high enough for a sufficiently good ranking of candidate policies, which in turn allows the evolutionary learning to find better solutions. This significantly increases the learning speed as well as the robustness without impairing the quality of the final solutions. We evaluate the UH-CMA-ES on fully and partially observable Markov decision processes with random start states and noisy observations. A canonical natural policy gradient method and random search serve as baselines for comparison.
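The core idea of the uncertainty handling can be illustrated with a minimal sketch: evaluate each candidate policy twice over a batch of episodes, measure how much the fitness ranking changes between the two evaluations, and increase the number of episodes per evaluation when the ranking is unstable. This is not the authors' exact algorithm; the toy objective, the noise model, and the threshold `theta` are illustrative assumptions.

```python
import random


def noisy_fitness(policy, n_episodes, rng):
    # Hypothetical noisy return: a toy true value plus per-episode noise,
    # averaged over n_episodes (so the noise shrinks as n_episodes grows).
    true_value = -policy ** 2  # toy objective: maximize -x^2
    noise = sum(rng.gauss(0.0, 1.0) for _ in range(n_episodes)) / n_episodes
    return true_value + noise


def rank(values):
    # Rank positions of each candidate, best (highest value) first.
    order = sorted(range(len(values)), key=lambda i: values[i], reverse=True)
    ranks = [0] * len(values)
    for pos, i in enumerate(order):
        ranks[i] = pos
    return ranks


def uncertainty_handled_evaluation(policies, n_episodes, rng, theta=1.0):
    # Evaluate every candidate twice with n_episodes episodes each.
    f1 = [noisy_fitness(p, n_episodes, rng) for p in policies]
    f2 = [noisy_fitness(p, n_episodes, rng) for p in policies]
    # Total rank change between the two evaluations measures how
    # reliable the ranking is at the current signal-to-noise ratio.
    rank_change = sum(abs(a - b) for a, b in zip(rank(f1), rank(f2)))
    if rank_change > theta * len(policies):
        # Ranking too unstable: spend more episodes per evaluation
        # next generation (capped to keep the sketch bounded).
        n_episodes = min(2 * n_episodes, 1000)
    fitness = [(a + b) / 2 for a, b in zip(f1, f2)]
    return fitness, n_episodes
```

In the full algorithm this adaptation runs inside each CMA-ES generation, so the evaluation effort stays just high enough for selection to work, rather than being fixed conservatively in advance.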