Temporal difference and evolutionary methods are two of the most common approaches to solving reinforcement learning problems. However, there is little consensus on their relative merits, and few empirical studies have directly compared their performance. This article aims to address this shortcoming by presenting results of empirical comparisons between Sarsa and NEAT, two representative methods, in mountain car and keepaway, two benchmark reinforcement learning tasks. In each task, the methods are evaluated in combination with both linear and nonlinear representations to determine their best configurations. In addition, this article tests two specific hypotheses about the critical factors contributing to these methods' relative performance: (1) that sensor noise reduces the final performance of Sarsa more than that of NEAT, because Sarsa's learning updates are unreliable in the absence of the Markov property; and (2) that stochasticity, by introducing noise into fitness estimates, reduces the learning speed of NEAT more than that of Sarsa. Experiments in variations of mountain car and keepaway designed to isolate these factors confirm both hypotheses.
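To make hypothesis (1) concrete, the sketch below shows the linear Sarsa update at the heart of the comparison: the weight change is driven by a temporal-difference error computed from the observed transition, so noisy (non-Markov) observations corrupt every update rather than merely adding variance to a whole-episode fitness score, as they would for NEAT. This is an illustrative sketch, not the authors' implementation; the function name, signature, and default step sizes are assumptions.

```python
import numpy as np

def sarsa_update(w, phi_sa, r, phi_next_sa, alpha=0.1, gamma=1.0, done=False):
    """One Sarsa update for a linear action-value function Q(s, a) = w . phi(s, a).

    w           -- weight vector (modified in place)
    phi_sa      -- feature vector for the current state-action pair
    r           -- observed reward
    phi_next_sa -- feature vector for the next state-action pair (ignored if done)

    Returns the TD error; with noisy sensors, phi_sa and phi_next_sa misrepresent
    the true state, so this error (and hence every weight update) is biased.
    """
    q = w @ phi_sa
    q_next = 0.0 if done else w @ phi_next_sa
    td_error = r + gamma * q_next - q  # bootstrapped one-step TD error
    w += alpha * td_error * phi_sa     # gradient step on the squared TD error
    return td_error
```

For example, with zero initial weights, a reward of -1 (as in mountain car, where every step is penalized) yields a TD error of -1 and shifts the weight on the active feature downward by `alpha`.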