The Racing Algorithm: Model Selection for Lazy Learners
Artificial Intelligence Review - Special issue on lazy learning
Introduction to Reinforcement Learning
Averaging Efficiently in the Presence of Noise
PPSN V Proceedings of the 5th International Conference on Parallel Problem Solving from Nature
A Racing Algorithm for Configuring Metaheuristics
GECCO '02 Proceedings of the Genetic and Evolutionary Computation Conference
Making Driver Modeling Attractive
IEEE Intelligent Systems
Evolutionary Function Approximation for Reinforcement Learning
The Journal of Machine Learning Research
Neurocomputing
Evolutionary reinforcement learning of artificial neural networks
International Journal of Hybrid Intelligent Systems - Hybridization of Intelligent Systems
Proceedings of the 25th international conference on Machine learning
Accelerated Neural Evolution through Cooperatively Coevolved Synapses
The Journal of Machine Learning Research
Uncertainty handling CMA-ES for reinforcement learning
Proceedings of the 11th Annual conference on Genetic and evolutionary computation
IEEE Transactions on Evolutionary Computation - Special issue on computational finance and economics
Integrating techniques from statistical ranking into evolutionary algorithms
EuroGP'06 Proceedings of the 2006 international conference on Applications of Evolutionary Computing
Computing label-constraint reachability in graph databases
Proceedings of the 2010 ACM SIGMOD International Conference on Management of data
New uncertainty handling strategies in multi-objective evolutionary optimization
PPSN'10 Proceedings of the 11th international conference on Parallel problem solving from nature: Part II
Bandit-based estimation of distribution algorithms for noisy optimization: rigorous runtime analysis
LION'10 Proceedings of the 4th international conference on Learning and intelligent optimization
Handling expensive optimization with large noise
Proceedings of the 11th workshop proceedings on Foundations of genetic algorithms
How to promote generalisation in evolutionary robotics: the ProGAb approach
Proceedings of the 13th annual conference on Genetic and evolutionary computation
Using the uncertainty handling CMA-ES for finding robust optima
Proceedings of the 13th annual conference on Genetic and evolutionary computation
The effects of selection on noisy fitness optimization
Proceedings of the 13th annual conference on Genetic and evolutionary computation
Designing artificial tetris players with evolution strategies and racing
Proceedings of the 13th annual conference companion on Genetic and evolutionary computation
EvoApplicatons'10 Proceedings of the 2010 international conference on Applications of Evolutionary Computation - Volume Part I
APRIL: active preference learning-based reinforcement learning
ECML PKDD'12 Proceedings of the 2012 European conference on Machine Learning and Knowledge Discovery in Databases - Volume Part II
Reducing the learning time of tetris in evolution strategies
EA'11 Proceedings of the 10th international conference on Artificial Evolution
Online learning with multiple kernels: A review
Neural Computation
S-Race: a multi-objective racing algorithm
Proceedings of the 15th annual conference on Genetic and evolutionary computation
Lazy paired hyper-parameter tuning
IJCAI'13 Proceedings of the Twenty-Third international joint conference on Artificial Intelligence
Uncertainty arises in reinforcement learning from several sources, so behavioral policies must be evaluated with statistics computed over multiple roll-outs. We add adaptive uncertainty handling, based on Hoeffding and empirical Bernstein races, to the CMA-ES, a variable-metric evolution strategy proposed for direct policy search. The uncertainty handling individually adjusts the number of episodes used to evaluate each policy: performance estimates are kept just accurate enough to rank the candidate policies reliably, which in turn suffices for the CMA-ES to find better solutions. This increases both the learning speed and the robustness of the algorithm.
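The racing idea behind the abstract can be sketched as follows. This is not the authors' exact CMA-ES integration (which also uses empirical Bernstein bounds and is interleaved with the strategy's ranking-based update); it is a minimal Hoeffding-race sketch in which every function name and parameter is illustrative: each candidate policy receives additional evaluation episodes only while its Hoeffding confidence interval still overlaps another candidate's, so the number of roll-outs per policy adapts to how hard the ranking is.

```python
import math

def hoeffding_radius(n, delta, value_range):
    """Hoeffding confidence radius for the mean of n samples bounded in a
    range of width value_range, at confidence level 1 - delta."""
    return value_range * math.sqrt(math.log(2.0 / delta) / (2.0 * n))

def race_rank(policies, rollout, n_min=3, n_max=50, delta=0.05, value_range=1.0):
    """Rank candidate policies using just enough roll-outs per policy.

    `rollout(policy)` is assumed to return one episode's return. A policy
    gets further episodes only while its confidence interval overlaps some
    other policy's interval; once all intervals are disjoint (or n_max is
    reached) the ranking is considered settled and evaluation stops.
    """
    returns = {i: [rollout(p) for _ in range(n_min)] for i, p in enumerate(policies)}
    for _ in range(n_min, n_max):
        bounds = {}
        for i, rs in returns.items():
            mean = sum(rs) / len(rs)
            r = hoeffding_radius(len(rs), delta, value_range)
            bounds[i] = (mean - r, mean + r)
        # A policy is undecided if its interval overlaps any other interval.
        undecided = [i for i in bounds
                     if any(j != i
                            and bounds[i][0] < bounds[j][1]
                            and bounds[j][0] < bounds[i][1]
                            for j in bounds)]
        if not undecided:
            break  # ranking statistically settled: stop reevaluating early
        for i in undecided:
            returns[i].append(rollout(policies[i]))
    means = {i: sum(rs) / len(rs) for i, rs in returns.items()}
    return sorted(means, key=means.get, reverse=True)
```

In the actual algorithm the race only needs to separate candidates well enough for the CMA-ES's rank-based selection, so the evaluation budget concentrates on policies whose returns are hard to distinguish, rather than spending the same number of episodes on every individual.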