Training Reinforcement Neurocontrollers Using the Polytope Algorithm

Authors:
Aristidis Likas;Isaac E. Lagaris
Affiliations:
Department of Computer Science, University of Ioannina, P.O. Box 1186 – GR 45110 Ioannina, Greece. e-mail: arly@cs.uoi.gr
Venue:
Neural Processing Letters
Year:
1999


Abstract

A new training algorithm is presented for delayed reinforcement learning problems. The algorithm does not assume the existence of a critic model; instead, it employs the polytope optimization algorithm to adjust the weights of the action network so that a simple, direct measure of training performance is maximized. Experimental results from applying the method to the pole-balancing problem indicate improved training performance compared with critic-based and genetic reinforcement approaches.
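
The idea in the abstract can be illustrated with a minimal sketch: a polytope (Nelder-Mead simplex) search adjusts controller weights to maximize a direct performance score, with no critic involved. Everything specific below is an assumption made for illustration and is not taken from the paper: the one-unit linear threshold controller standing in for the action network, the use of "time steps balanced" as the performance measure, the commonly used cart-pole constants, the start state, and the function names `simulate` and `polytope_maximize`.

```python
import math

# Commonly used cart-pole constants (assumed; the abstract does not list them).
GRAVITY, M_CART, M_POLE, HALF_LEN = 9.8, 1.0, 0.1, 0.5
FORCE, DT = 10.0, 0.02
X_LIMIT, THETA_LIMIT = 2.4, 12.0 * math.pi / 180.0


def simulate(weights, max_steps=1000, start=(0.0, 0.0, 0.05, 0.0)):
    """Return how many steps a linear threshold controller keeps the pole up."""
    x, x_dot, th, th_dot = start
    total_mass = M_CART + M_POLE
    for step in range(max_steps):
        # Hypothetical one-unit "action network": sign of a weighted sum.
        act = sum(w * s for w, s in zip(weights, (x, x_dot, th, th_dot)))
        force = FORCE if act > 0 else -FORCE
        sin_t, cos_t = math.sin(th), math.cos(th)
        temp = (force + M_POLE * HALF_LEN * th_dot ** 2 * sin_t) / total_mass
        th_acc = (GRAVITY * sin_t - cos_t * temp) / (
            HALF_LEN * (4.0 / 3.0 - M_POLE * cos_t ** 2 / total_mass))
        x_acc = temp - M_POLE * HALF_LEN * th_acc * cos_t / total_mass
        # Euler integration of the equations of motion.
        x, x_dot = x + DT * x_dot, x_dot + DT * x_acc
        th, th_dot = th + DT * th_dot, th_dot + DT * th_acc
        if abs(x) > X_LIMIT or abs(th) > THETA_LIMIT:
            return step
    return max_steps


def polytope_maximize(f, x0, step=1.0, max_iter=200, target=None):
    """Nelder-Mead polytope search that maximizes f, starting near x0."""
    n = len(x0)
    simplex = [list(x0)]
    for i in range(n):
        p = list(x0)
        p[i] += step
        simplex.append(p)
    scores = [f(p) for p in simplex]
    for _ in range(max_iter):
        # Sort vertices from best (highest score) to worst.
        order = sorted(range(n + 1), key=lambda i: scores[i], reverse=True)
        simplex = [simplex[i] for i in order]
        scores = [scores[i] for i in order]
        if target is not None and scores[0] >= target:
            break
        worst = simplex[-1]
        centroid = [sum(p[j] for p in simplex[:-1]) / n for j in range(n)]

        def along(t):
            # Point on the line from the worst vertex through the centroid.
            return [c + t * (c - w) for c, w in zip(centroid, worst)]

        xr = along(1.0)                      # reflection
        fr = f(xr)
        if fr > scores[0]:
            xe = along(2.0)                  # expansion
            fe = f(xe)
            simplex[-1], scores[-1] = (xe, fe) if fe > fr else (xr, fr)
        elif fr > scores[-2]:
            simplex[-1], scores[-1] = xr, fr
        else:
            # Outside contraction if reflection beat the worst, else inside.
            xc = along(0.5 if fr > scores[-1] else -0.5)
            fc = f(xc)
            if fc > max(fr, scores[-1]):
                simplex[-1], scores[-1] = xc, fc
            else:
                best = simplex[0]            # shrink toward the best vertex
                for i in range(1, n + 1):
                    simplex[i] = [b + 0.5 * (v - b)
                                  for b, v in zip(best, simplex[i])]
                    scores[i] = f(simplex[i])
    k = max(range(n + 1), key=lambda i: scores[i])
    return simplex[k], scores[k]


# Train the four controller weights directly on the balancing score.
best_w, best_steps = polytope_maximize(simulate, [0.0] * 4, target=1000)
print("weights:", best_w, "steps balanced:", best_steps)
```

Because the search only compares scores of candidate weight vectors, it needs neither gradients nor a learned value estimate, which is what makes a direct performance measure usable here. The best vertex is never discarded, so the reported score cannot fall below that of the starting weights; whether the full 1000 steps are reached depends on the start state and the initial polytope.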