To excel in challenging tasks, intelligent agents need sophisticated mechanisms for action selection: policies that dictate what action to take in each situation. Reinforcement learning (RL) algorithms are designed to learn such policies given only positive and negative rewards. Two contrasting approaches to RL currently in popular use are temporal difference (TD) methods, which learn value functions, and evolutionary methods, which optimize populations of candidate policies. Both approaches have had practical successes, but few studies have directly compared them, so there are no general guidelines describing their relative strengths and weaknesses. In addition, there has been little cross-collaboration, with few attempts to make the two approaches work together or to apply ideas from one to the other. In this article we aim to address these shortcomings via three empirical studies that compare these methods and investigate new ways of making them work together. First, we compare the two approaches on a benchmark task and identify variations of the task that isolate the factors critical to the performance of each method. Second, we investigate ways to make evolutionary algorithms excel at on-line tasks by borrowing exploratory mechanisms traditionally used by TD methods. We present empirical results demonstrating a dramatic performance improvement. Third, we explore a novel way of making evolutionary and TD methods work together by using evolution to automatically discover good representations for TD function approximators. We present results demonstrating that this novel approach can outperform both TD and evolutionary methods alone.
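To make the contrast between the two families concrete, the following is a minimal sketch, not taken from the article: a tabular TD-style value update and a simple evolutionary (selection-plus-mutation) policy search, both applied to a toy two-armed bandit. The task, the hyperparameters, and all function names here are illustrative assumptions, not the article's actual benchmarks or algorithms.

```python
import random

random.seed(0)

# Toy task (assumed for illustration): a two-armed stochastic bandit.
ARM_MEANS = [0.2, 0.8]  # expected reward of each action

def pull(arm):
    # Bernoulli reward: 1 with probability ARM_MEANS[arm], else 0.
    return 1.0 if random.random() < ARM_MEANS[arm] else 0.0

def td_learn(episodes=2000, alpha=0.1, epsilon=0.1):
    # TD-style learning: maintain action-value estimates and move each
    # estimate toward the sampled reward (a one-step TD update).
    q = [0.0, 0.0]
    for _ in range(episodes):
        if random.random() < epsilon:          # explore
            arm = random.randrange(2)
        else:                                  # exploit current values
            arm = max(range(2), key=lambda a: q[a])
        r = pull(arm)
        q[arm] += alpha * (r - q[arm])         # TD-style value update
    return q

def evolve(generations=50, pop_size=10, evals=20):
    # Evolutionary search: a "policy" is just a preferred arm; fitness is
    # the average reward over a few evaluation pulls. Each generation,
    # the fittest policy is copied forward with occasional mutation.
    pop = [random.randrange(2) for _ in range(pop_size)]
    best = pop[0]
    for _ in range(generations):
        fitness = [sum(pull(p) for _ in range(evals)) / evals for p in pop]
        best = pop[fitness.index(max(fitness))]
        pop = [best if random.random() < 0.9 else 1 - best
               for _ in range(pop_size)]
    return best
```

The TD learner converges on per-action value estimates (from which a greedy policy is read off), while the evolutionary search never estimates values at all; it evaluates whole candidate policies by their episode returns. This difference in what is being learned is exactly what makes the two approaches suited to different task variations.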