The cascade-correlation learning architecture. Advances in Neural Information Processing Systems 2.
A resource-allocating network for function interpolation. Neural Computation.
Temporal difference learning and TD-Gammon. Communications of the ACM.
Investigation of the CasCor family of learning algorithms. Neural Networks.
Introduction to Reinforcement Learning.
Learning to Predict by the Methods of Temporal Differences. Machine Learning.
Q-Learning with Hidden-Unit Restarting. Advances in Neural Information Processing Systems 5 (NIPS Conference).
Feedforward Neural Networks in Reinforcement Learning Applied to High-Dimensional Motor Control. ALT '02: Proceedings of the 13th International Conference on Algorithmic Learning Theory.
The cascade-correlation learning: a projection pursuit learning perspective. IEEE Transactions on Neural Networks.
A reinforcement learning framework for online data migration in hierarchical storage systems. The Journal of Supercomputing.
In order to scale to large state spaces, reinforcement learning (RL) algorithms need to apply function approximation techniques. Research on function approximation for RL has so far focused either on global methods with a static structure or on constructive architectures using locally responsive units. The former, whilst achieving some notable successes, has also failed on some relatively simple tasks. The locally constructive approach is more stable, but may scale poorly to higher-dimensional inputs. This paper examines two globally constructive algorithms based on the Cascade-Correlation (Cascor) supervised-learning algorithm. These algorithms are applied within the Sarsa RL algorithm, and their performance is compared against a multi-layer perceptron and a locally constructive algorithm, the Resource Allocating Network (RAN). It is shown that the globally constructive algorithms are less stable, but that on some tasks they achieve similar performance to the RAN, whilst generating more compact solutions.
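For orientation, the sketch below shows the general setting the abstract describes: on-policy Sarsa(0) with a neural-network approximator supplying the Q-values. It is a minimal illustration only; the Corridor toy task, the fixed one-hidden-layer network, and all hyper-parameters are assumptions made for this example, and the paper's Cascor-based constructive networks, the RAN, and its benchmark tasks are not reproduced here.

```python
# Illustrative sketch: Sarsa(0) with a small fixed MLP as the Q-function
# approximator. The environment, network sizes and hyper-parameters are
# assumptions for this example, not the paper's experimental setup.
import numpy as np

rng = np.random.default_rng(0)

class MLPQ:
    """One-hidden-layer perceptron approximating Q(s, a) for discrete actions."""
    def __init__(self, n_inputs, n_actions, n_hidden=8, lr=0.01):
        self.W1 = rng.normal(0.0, 0.1, (n_hidden, n_inputs))
        self.b1 = np.zeros(n_hidden)
        self.W2 = rng.normal(0.0, 0.1, (n_actions, n_hidden))
        self.b2 = np.zeros(n_actions)
        self.lr = lr

    def forward(self, s):
        h = np.tanh(self.W1 @ s + self.b1)
        return h, self.W2 @ h + self.b2          # hidden activations, Q-values

    def update(self, s, a, target):
        """One gradient step pushing Q(s, a) towards the Sarsa target."""
        h, q = self.forward(s)
        err = target - q[a]
        dh = err * self.W2[a] * (1.0 - h ** 2)   # backprop before changing W2
        self.W2[a] += self.lr * err * h
        self.b2[a] += self.lr * err
        self.W1 += self.lr * np.outer(dh, s)
        self.b1 += self.lr * dh

class Corridor:
    """Hypothetical toy task: walk right along a line to reach the goal."""
    def __init__(self, length=10):
        self.length = length
    def reset(self):
        self.pos, self.t = 0, 0
        return self._obs()
    def _obs(self):
        return np.array([self.pos / self.length])
    def step(self, a):                            # a = 0 (left) or 1 (right)
        self.t += 1
        self.pos = max(self.pos + (1 if a == 1 else -1), 0)
        done = self.pos >= self.length or self.t >= 200
        reward = 1.0 if self.pos >= self.length else -0.01
        return self._obs(), reward, done

def epsilon_greedy(q, eps=0.1):
    return int(rng.integers(len(q))) if rng.random() < eps else int(np.argmax(q))

def sarsa_episode(env, net, gamma=0.99):
    """One episode of on-policy Sarsa(0) using the approximator's Q-values."""
    s = env.reset()
    a = epsilon_greedy(net.forward(s)[1])
    done = False
    while not done:
        s2, r, done = env.step(a)
        if done:
            target = r
        else:
            q2 = net.forward(s2)[1]
            a2 = epsilon_greedy(q2)
            target = r + gamma * q2[a2]           # bootstrapped Sarsa target
        net.update(s, a, target)
        if not done:
            s, a = s2, a2

env, net = Corridor(), MLPQ(n_inputs=1, n_actions=2)
for _ in range(200):
    sarsa_episode(env, net)
```

The constructive algorithms studied in the paper differ from this fixed-topology sketch in that hidden units are added to the approximator during learning (globally, in the Cascor-derived variants, or locally, in the RAN), rather than being fixed in advance as they are here.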