Feedforward Neural Networks in Reinforcement Learning Applied to High-Dimensional Motor Control

Authors:
Rémi Coulom
Affiliations:
-
Venue:
ALT '02 Proceedings of the 13th International Conference on Algorithmic Learning Theory
Year:
2002

Abstract

Local linear function approximators are often preferred to feedforward neural networks for estimating value functions in reinforcement learning. However, the motor tasks usually solved with this kind of method have low-dimensional state spaces. This article demonstrates that feedforward neural networks can be applied successfully to high-dimensional problems. The main difficulties of using backpropagation networks in reinforcement learning are reviewed, and a simple method to perform gradient descent efficiently is proposed. The method was tested successfully on an original task in which a complex simulated articulated robot learns to swim, with 4 control variables and 12 independent state variables.
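
To make the setting concrete, the following is a minimal sketch of a feedforward network used as a value-function estimator and trained by gradient descent on a temporal-difference error. It is a generic discrete-time semi-gradient TD(0) update, not the method proposed in the article, which the abstract does not specify. The class name ValueNet, the hidden-layer size, the learning rate, and the discount factor are all assumptions made for illustration; only the 12 state variables come from the abstract.

```python
import math
import random

STATE_DIM = 12  # the 12 independent state variables mentioned in the abstract
HIDDEN = 8      # hypothetical hidden-layer size, chosen only for illustration


class ValueNet:
    """One-hidden-layer network: V(s) = w2 . tanh(W1 s + b1) + b2."""

    def __init__(self, n_in, n_hidden, seed=0):
        rng = random.Random(seed)
        scale = 1.0 / math.sqrt(n_in)
        self.W1 = [[rng.uniform(-scale, scale) for _ in range(n_in)]
                   for _ in range(n_hidden)]
        self.b1 = [0.0] * n_hidden
        self.w2 = [rng.uniform(-0.1, 0.1) for _ in range(n_hidden)]
        self.b2 = 0.0

    def hidden(self, s):
        # Hidden activations h_j = tanh(W1[j] . s + b1[j])
        return [math.tanh(sum(w * x for w, x in zip(row, s)) + b)
                for row, b in zip(self.W1, self.b1)]

    def value(self, s):
        h = self.hidden(s)
        return sum(w * hj for w, hj in zip(self.w2, h)) + self.b2

    def td0_update(self, s, reward, s_next, gamma=0.95, alpha=0.01):
        """Semi-gradient TD(0): theta += alpha * delta * dV(s)/dtheta."""
        h = self.hidden(s)
        v = sum(w * hj for w, hj in zip(self.w2, h)) + self.b2
        # TD error; the bootstrap target is treated as a constant.
        delta = reward + gamma * self.value(s_next) - v
        for j, hj in enumerate(h):
            # Backpropagation through tanh: dV/d(pre-activation j)
            back = self.w2[j] * (1.0 - hj * hj)
            for i, x in enumerate(s):
                self.W1[j][i] += alpha * delta * back * x
            self.b1[j] += alpha * delta * back
            self.w2[j] += alpha * delta * hj
        self.b2 += alpha * delta
        return delta


# Example: one update on a made-up transition (values are arbitrary).
net = ValueNet(STATE_DIM, HIDDEN)
state = [0.1] * STATE_DIM
next_state = [0.2] * STATE_DIM
td_error = net.td0_update(state, reward=1.0, s_next=next_state)
```

A real controller for a task such as the swimmer would also need a way to choose the 4 control variables from the learned values; that part is outside the scope of this sketch.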