Implementing Temporal-Difference Learning with the Scaled Conjugate Gradient Algorithm

Authors:
Tasos Falas;Andreas Stafylopatis
Affiliations:
Aff1 Aff2;School of Electrical and Computer Engineering, National Technical University of Athens, Athens, Greece
Venue:
Neural Processing Letters
Year:
2005

Abstract

This paper investigates the use of the scaled conjugate gradient (SCG) algorithm in temporal-difference (TD) learning for time series prediction. After examining the theoretical background of the algorithm and of the learning methodology, and how the two can be combined, the paper places special emphasis on implementation details. Simple time series (linear, sinusoidal, etc.) as well as more complex ones drawn from real data are used to examine the behavior of this novel combination of learning algorithm and methodology. Preliminary experimental results indicate that the implementation presented here works, but that the performance of TD learning with the SCG algorithm, in terms of learning speed and generalization ability, is not as good as expected, at least on the representative problems examined. An attempt to explain these results is presented.
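As a rough illustration of the TD side of this combination, the sketch below computes the accumulated TD(lambda) weight change for one observation sequence, following Sutton's formulation in "Learning to Predict by the Methods of Temporal Differences". It is not the authors' implementation: the linear predictor, the function names, and the parameter names are all assumptions made for this example, and the SCG optimizer itself is not shown. A batch optimizer such as SCG needs a search direction per epoch, and one plausible (assumed) way to supply it is to use this accumulated quantity in place of the supervised error gradient.

```python
import numpy as np

def td_lambda_direction(w, xs, z, lam):
    """Accumulated TD(lambda) weight change for one sequence, before
    scaling by any step size, for a linear predictor P_t = w . x_t.

    xs  : array of shape (m, n), observations x_1..x_m (hypothetical layout)
    z   : scalar outcome observed at the end of the sequence
    lam : trace decay parameter in [0, 1]
    """
    xs = np.asarray(xs, dtype=float)
    preds = xs @ w                       # P_1 .. P_m
    targets = np.append(preds[1:], z)    # P_2 .. P_m, with z standing in for P_{m+1}
    trace = np.zeros_like(w, dtype=float)
    direction = np.zeros_like(w, dtype=float)
    for x_t, p_t, p_next in zip(xs, preds, targets):
        # Eligibility trace: decayed sum of past gradients; for a linear
        # predictor the gradient of P_t with respect to w is simply x_t.
        trace = lam * trace + x_t
        direction += (p_next - p_t) * trace
    return direction

def supervised_direction(w, xs, z):
    """Supervised (Widrow-Hoff) counterpart: sum over t of (z - P_t) * x_t."""
    xs = np.asarray(xs, dtype=float)
    return ((z - xs @ w)[:, None] * xs).sum(axis=0)

# Small usage example on random data (values are arbitrary).
rng = np.random.default_rng(0)
w = rng.normal(size=3)
xs = rng.normal(size=(5, 3))
z = 1.5
d_td1 = td_lambda_direction(w, xs, z, lam=1.0)
d_sup = supervised_direction(w, xs, z)
```

With lam set to 1 the TD direction coincides with the supervised one, which gives a convenient sanity check when wiring such a routine into a batch optimizer; smaller values of lam weight recent predictions more heavily.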