To excel in challenging tasks, intelligent agents need sophisticated mechanisms for action selection: policies that dictate what action to take in each situation. Reinforcement learning (RL) algorithms are designed to learn such policies given only positive and negative rewards. Two contrasting approaches to RL currently in popular use are temporal difference (TD) methods, which learn value functions, and evolutionary methods, which optimize populations of candidate policies. Both approaches have had practical successes, but few studies have directly compared them, so there are no general guidelines describing their relative strengths and weaknesses. In addition, there has been little cross-collaboration, with few attempts to make the two approaches work together or to apply ideas from one to the other. In this article we aim to address these shortcomings via three empirical studies that compare these methods and investigate new ways of making them work together. First, we compare the two approaches on a benchmark task and identify variations of the task that isolate the factors critical to the performance of each method. Second, we investigate ways to make evolutionary algorithms excel at on-line tasks by borrowing exploratory mechanisms traditionally used by TD methods. We present empirical results demonstrating a dramatic performance improvement. Third, we explore a novel way of making evolutionary and TD methods work together by using evolution to automatically discover good representations for TD function approximators. We present results demonstrating that this novel approach can outperform both TD and evolutionary methods alone.
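To make the contrast between the two families concrete, the following is a minimal sketch, not taken from the article: a tabular TD-style value update and a simple evolutionary (selection-plus-mutation) policy search, both applied to a toy two-armed bandit. The task, the hyperparameters, and all function names here are illustrative assumptions, not the article's actual benchmarks or algorithms.

```python
import random

random.seed(0)

# Toy task (assumed for illustration): a two-armed stochastic bandit.
ARM_MEANS = [0.2, 0.8]  # expected reward of each action

def pull(arm):
    # Bernoulli reward: 1 with probability ARM_MEANS[arm], else 0.
    return 1.0 if random.random() < ARM_MEANS[arm] else 0.0

def td_learn(episodes=2000, alpha=0.1, epsilon=0.1):
    # TD-style learning: maintain action-value estimates and move each
    # estimate toward the sampled reward (a one-step TD update).
    q = [0.0, 0.0]
    for _ in range(episodes):
        if random.random() < epsilon:          # explore
            arm = random.randrange(2)
        else:                                  # exploit current values
            arm = max(range(2), key=lambda a: q[a])
        r = pull(arm)
        q[arm] += alpha * (r - q[arm])         # TD-style value update
    return q

def evolve(generations=50, pop_size=10, evals=20):
    # Evolutionary search: a "policy" is just a preferred arm; fitness is
    # the average reward over a few evaluation pulls. Each generation,
    # the fittest policy is copied forward with occasional mutation.
    pop = [random.randrange(2) for _ in range(pop_size)]
    best = pop[0]
    for _ in range(generations):
        fitness = [sum(pull(p) for _ in range(evals)) / evals for p in pop]
        best = pop[fitness.index(max(fitness))]
        pop = [best if random.random() < 0.9 else 1 - best
               for _ in range(pop_size)]
    return best
```

The TD learner converges on per-action value estimates (from which a greedy policy is read off), while the evolutionary search never estimates values at all; it evaluates whole candidate policies by their episode returns. This difference in what is being learned is exactly what makes the two approaches suited to different task variations.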