Improving reinforcement learning function approximators via neuroevolution

  • Authors:
  • Shimon Whiteson

  • Affiliations:
  • University of Texas at Austin, Austin, TX

  • Venue:
  • Proceedings of the Fourth International Joint Conference on Autonomous Agents and Multiagent Systems
  • Year:
  • 2005

Abstract

Reinforcement learning problems are commonly tackled with temporal difference methods, which estimate the long-term value of taking each action in each state. In most problems of real-world interest, learning this value function requires a function approximator. However, the feasibility of using function approximators depends on the ability of the human designer to select an appropriate representation for the value function. My thesis presents a new approach to function approximation that automates some of these difficult design choices by coupling temporal difference methods with policy search methods such as evolutionary computation. It also presents a particular implementation which combines NEAT, a neuroevolutionary policy search method, and Q-learning, a popular temporal difference method, to yield a new method called NEAT+Q that automatically learns effective representations for neural network function approximators. Empirical results in a server job scheduling task demonstrate that NEAT+Q can outperform both NEAT and Q-learning with manually designed neural networks.
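Below is a minimal sketch, not the author's implementation, of the core NEAT+Q idea described in the abstract: a candidate network produced by neuroevolution is further refined by Q-learning updates while it is being evaluated. The network class, its fixed topology, and the toy environment loop are all illustrative assumptions; in NEAT+Q the topology and initial weights would come from NEAT, and the return accumulated during evaluation would serve as the individual's fitness.

```python
# Sketch of Q-learning updates applied to one evolved value network (NEAT+Q idea).
# The fixed two-layer topology and the dummy environment below are illustrative only.
import numpy as np

class QNetwork:
    """Tiny feedforward approximator of Q(s, a) for a discrete action set."""
    def __init__(self, n_inputs, n_hidden, n_actions, rng):
        self.w1 = rng.normal(0.0, 0.1, (n_inputs, n_hidden))
        self.w2 = rng.normal(0.0, 0.1, (n_hidden, n_actions))

    def forward(self, s):
        h = np.tanh(s @ self.w1)
        return h, h @ self.w2                      # hidden activations, Q-values

    def q_update(self, s, a, reward, s_next, alpha=0.01, gamma=0.99):
        """One gradient step toward the Q-learning target r + gamma * max_a' Q(s', a')."""
        h, q = self.forward(s)
        _, q_next = self.forward(s_next)
        td_error = reward + gamma * np.max(q_next) - q[a]
        dh = td_error * self.w2[:, a] * (1.0 - h ** 2)   # backprop TD error to hidden layer
        self.w2[:, a] += alpha * td_error * h
        self.w1 += alpha * np.outer(s, dh)

rng = np.random.default_rng(0)
net = QNetwork(n_inputs=4, n_hidden=8, n_actions=2, rng=rng)

# During one individual's evaluation, experience drives Q-learning updates;
# the rewards collected here would then be summed into its evolutionary fitness.
for _ in range(100):
    s = rng.normal(size=4)
    a = int(np.argmax(net.forward(s)[1]))          # greedy action, for brevity
    reward, s_next = rng.normal(), rng.normal(size=4)
    net.q_update(s, a, reward, s_next)
```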