The Racing Algorithm: Model Selection for Lazy Learners
Artificial Intelligence Review - Special issue on lazy learning
Introduction to Reinforcement Learning
Averaging Efficiently in the Presence of Noise
PPSN V Proceedings of the 5th International Conference on Parallel Problem Solving from Nature
A Racing Algorithm for Configuring Metaheuristics
GECCO '02 Proceedings of the Genetic and Evolutionary Computation Conference
Making Driver Modeling Attractive
IEEE Intelligent Systems
Evolutionary Function Approximation for Reinforcement Learning
The Journal of Machine Learning Research
Neurocomputing
Evolutionary reinforcement learning of artificial neural networks
International Journal of Hybrid Intelligent Systems - Hybridization of Intelligent Systems
Proceedings of the 25th international conference on Machine learning
Accelerated Neural Evolution through Cooperatively Coevolved Synapses
The Journal of Machine Learning Research
Uncertainty handling CMA-ES for reinforcement learning
Proceedings of the 11th Annual conference on Genetic and evolutionary computation
IEEE Transactions on Evolutionary Computation - Special issue on computational finance and economics
Integrating techniques from statistical ranking into evolutionary algorithms
EuroGP'06 Proceedings of the 2006 international conference on Applications of Evolutionary Computing
Computing label-constraint reachability in graph databases
Proceedings of the 2010 ACM SIGMOD International Conference on Management of data
New uncertainty handling strategies in multi-objective evolutionary optimization
PPSN'10 Proceedings of the 11th international conference on Parallel problem solving from nature: Part II
Bandit-based estimation of distribution algorithms for noisy optimization: rigorous runtime analysis
LION'10 Proceedings of the 4th international conference on Learning and intelligent optimization
Handling expensive optimization with large noise
Proceedings of the 11th workshop proceedings on Foundations of genetic algorithms
How to promote generalisation in evolutionary robotics: the ProGAb approach
Proceedings of the 13th annual conference on Genetic and evolutionary computation
Using the uncertainty handling CMA-ES for finding robust optima
Proceedings of the 13th annual conference on Genetic and evolutionary computation
The effects of selection on noisy fitness optimization
Proceedings of the 13th annual conference on Genetic and evolutionary computation
Designing artificial tetris players with evolution strategies and racing
Proceedings of the 13th annual conference companion on Genetic and evolutionary computation
EvoApplicatons'10 Proceedings of the 2010 international conference on Applications of Evolutionary Computation - Volume Part I
APRIL: active preference learning-based reinforcement learning
ECML PKDD'12 Proceedings of the 2012 European conference on Machine Learning and Knowledge Discovery in Databases - Volume Part II
Reducing the learning time of tetris in evolution strategies
EA'11 Proceedings of the 10th international conference on Artificial Evolution
Online learning with multiple kernels: A review
Neural Computation
S-Race: a multi-objective racing algorithm
Proceedings of the 15th annual conference on Genetic and evolutionary computation
Lazy paired hyper-parameter tuning
IJCAI'13 Proceedings of the Twenty-Third international joint conference on Artificial Intelligence
Uncertainty arises in reinforcement learning from several sources, so behavioral policies must be evaluated with statistics computed over multiple roll-outs. We add adaptive uncertainty handling, based on Hoeffding and empirical Bernstein races, to the CMA-ES, a variable-metric evolution strategy proposed for direct policy search. The uncertainty handling individually adjusts the number of episodes used to evaluate each policy: performance estimates are kept just accurate enough to rank the candidate policies reliably, which in turn suffices for the CMA-ES to find better solutions. This increases both the learning speed and the robustness of the algorithm.
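The racing idea behind the abstract can be sketched as follows. This is not the authors' exact CMA-ES integration (which also uses empirical Bernstein bounds and is interleaved with the strategy's ranking-based update); it is a minimal Hoeffding-race sketch in which every function name and parameter is illustrative: each candidate policy receives additional evaluation episodes only while its Hoeffding confidence interval still overlaps another candidate's, so the number of roll-outs per policy adapts to how hard the ranking is.

```python
import math

def hoeffding_radius(n, delta, value_range):
    """Hoeffding confidence radius for the mean of n samples bounded in a
    range of width value_range, at confidence level 1 - delta."""
    return value_range * math.sqrt(math.log(2.0 / delta) / (2.0 * n))

def race_rank(policies, rollout, n_min=3, n_max=50, delta=0.05, value_range=1.0):
    """Rank candidate policies using just enough roll-outs per policy.

    `rollout(policy)` is assumed to return one episode's return. A policy
    gets further episodes only while its confidence interval overlaps some
    other policy's interval; once all intervals are disjoint (or n_max is
    reached) the ranking is considered settled and evaluation stops.
    """
    returns = {i: [rollout(p) for _ in range(n_min)] for i, p in enumerate(policies)}
    for _ in range(n_min, n_max):
        bounds = {}
        for i, rs in returns.items():
            mean = sum(rs) / len(rs)
            r = hoeffding_radius(len(rs), delta, value_range)
            bounds[i] = (mean - r, mean + r)
        # A policy is undecided if its interval overlaps any other interval.
        undecided = [i for i in bounds
                     if any(j != i
                            and bounds[i][0] < bounds[j][1]
                            and bounds[j][0] < bounds[i][1]
                            for j in bounds)]
        if not undecided:
            break  # ranking statistically settled: stop reevaluating early
        for i in undecided:
            returns[i].append(rollout(policies[i]))
    means = {i: sum(rs) / len(rs) for i, rs in returns.items()}
    return sorted(means, key=means.get, reverse=True)
```

In the actual algorithm the race only needs to separate candidates well enough for the CMA-ES's rank-based selection, so the evaluation budget concentrates on policies whose returns are hard to distinguish, rather than spending the same number of episodes on every individual.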