Evolution of reward functions for reinforcement learning

Authors:
Scott Niekum;Lee Spector;Andrew Barto
Affiliations:
University of Massachusetts, Amherst, MA, USA;Hampshire College, Amherst, MA, USA;University of Massachusetts, Amherst, MA, USA
Venue:
Proceedings of the 13th annual conference companion on Genetic and evolutionary computation
Year:
2011

References:
Adaptive individuals in evolving populations: models and algorithms
Introduction to Reinforcement Learning
Genetic Programming and Autoconstructive Evolution with the Push Programming Language. Genetic Programming and Evolvable Machines
Policy Invariance Under Reward Transformations: Theory and Application to Reward Shaping. ICML '99 Proceedings of the Sixteenth International Conference on Machine Learning
The Push3 execution stack and the evolution of control. GECCO '05 Proceedings of the 7th annual conference on Genetic and evolutionary computation
Guest editorial: special issue on parallel and distributed evolutionary algorithms, part two. Genetic Programming and Evolvable Machines
Intrinsically Motivated Reinforcement Learning: An Evolutionary Perspective. IEEE Transactions on Autonomous Mental Development
Genetic Programming for Reward Function Search. IEEE Transactions on Autonomous Mental Development

Cited by:
Expressive genetic programming: tutorial: 2012 genetic and evolutionary computation conference (GECCO-2012). Proceedings of the 14th annual conference companion on Genetic and evolutionary computation
Expressive genetic programming. Proceedings of the 15th annual conference companion on Genetic and evolutionary computation
A reinforcement learning-based routing for delay tolerant networks. Engineering Applications of Artificial Intelligence

Abstract

The reward functions that drive reinforcement learning systems are generally derived directly from the descriptions of the problems that the systems are being used to solve. In some problem domains, however, alternative reward functions may allow systems to learn more quickly or more effectively. Here we describe work on the use of genetic programming to find novel reward functions that improve learning system performance. We briefly present the core concepts of our approach, our motivations in developing it, and reasons to believe that the approach has promise for the production of highly successful adaptive technologies. Experimental results are presented and analyzed in our full report [3].