Training Reinforcement Neurocontrollers Using the Polytope Algorithm

  • Authors:
  • Aristidis Likas; Isaac E. Lagaris

  • Affiliations:
  • Department of Computer Science, University of Ioannina, P.O. Box 1186 – GR 45110 Ioannina, Greece. e-mail: arly@cs.uoi.gr; Department of Computer Science, University of Ioannina, P.O. Box 1186 – GR 45110 Ioannina, Greece. e-mail: arly@cs.uoi.gr

  • Venue:
  • Neural Processing Letters
  • Year:
  • 1999

Abstract

A new training algorithm is presented for delayed reinforcement learning problems. The method does not assume the existence of a critic model; instead, it employs the polytope optimization algorithm to adjust the weights of the action network so that a simple, direct measure of training performance is maximized. Experimental results from applying the method to the pole balancing problem indicate improved training performance compared with critic-based and genetic reinforcement approaches.
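
The idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation, and it makes several assumptions: the polytope algorithm is taken to be the Nelder-Mead simplex method (here in a simplified variant that uses only inside contraction), the action network is replaced by a hypothetical four-weight linear bang-bang controller, the performance measure is assumed to be the number of time steps the pole stays balanced from one fixed start state, and the cart-pole constants are the commonly used benchmark values rather than values taken from the paper. The names `balance_steps` and `polytope_maximize` are illustrative.

```python
import math

def balance_steps(w, max_steps=1000):
    """Assumed performance measure: steps balanced by a linear bang-bang controller."""
    g, mc, mp, l, tau = 9.8, 1.0, 0.1, 0.5, 0.02   # common benchmark constants (assumed)
    x, x_dot, th, th_dot = 0.0, 0.0, 0.05, 0.0     # fixed start state for reproducibility
    for t in range(max_steps):
        s = w[0] * x + w[1] * x_dot + w[2] * th + w[3] * th_dot
        force = 10.0 if s > 0 else -10.0           # push right or left
        temp = (force + mp * l * th_dot ** 2 * math.sin(th)) / (mc + mp)
        th_acc = (g * math.sin(th) - math.cos(th) * temp) / (
            l * (4.0 / 3.0 - mp * math.cos(th) ** 2 / (mc + mp)))
        x_acc = temp - mp * l * th_acc * math.cos(th) / (mc + mp)
        x, x_dot = x + tau * x_dot, x_dot + tau * x_acc        # Euler integration
        th, th_dot = th + tau * th_dot, th_dot + tau * th_acc
        if abs(x) > 2.4 or abs(th) > 0.2095:       # cart off track or pole past ~12 degrees
            return t
    return max_steps

def polytope_maximize(f, x0, step=1.0, iters=200):
    """Simplified Nelder-Mead search that maximizes f, starting from x0."""
    n = len(x0)
    # Initial polytope: x0 plus one vertex offset along each coordinate axis.
    pts = [list(x0)] + [[x0[j] + (step if j == i else 0.0) for j in range(n)]
                        for i in range(n)]
    vals = [f(p) for p in pts]
    for _ in range(iters):
        order = sorted(range(n + 1), key=lambda i: vals[i], reverse=True)  # best first
        pts = [pts[i] for i in order]
        vals = [vals[i] for i in order]
        centroid = [sum(p[j] for p in pts[:-1]) / n for j in range(n)]
        worst = pts[-1]

        def along(c):
            # Point at centroid + c * (centroid - worst).
            return [centroid[j] + c * (centroid[j] - worst[j]) for j in range(n)]

        xr = along(1.0)                            # reflection
        fr = f(xr)
        if fr > vals[0]:
            xe = along(2.0)                        # expansion
            fe = f(xe)
            pts[-1], vals[-1] = (xe, fe) if fe > fr else (xr, fr)
        elif fr > vals[-2]:
            pts[-1], vals[-1] = xr, fr
        else:
            xc = along(-0.5)                       # inside contraction
            fc = f(xc)
            if fc > vals[-1]:
                pts[-1], vals[-1] = xc, fc
            else:                                  # shrink all vertices toward the best
                best = pts[0]
                pts = [best] + [[best[j] + 0.5 * (p[j] - best[j]) for j in range(n)]
                                for p in pts[1:]]
                vals = [vals[0]] + [f(p) for p in pts[1:]]
    i = max(range(n + 1), key=lambda k: vals[k])
    return pts[i], vals[i]

# Usage: tune the controller weights directly on the balancing score, with no critic.
w0 = [0.0, 0.0, 1.0, 1.0]                          # arbitrary initial weights
best_w, best_score = polytope_maximize(balance_steps, w0)
print("initial steps:", balance_steps(w0), "tuned steps:", best_score)
```

Because the search only needs function values, the step-count score can be used as the objective directly, even though it is piecewise constant and has no usable gradient. That same property can make the search stall on flat regions, so a practical setup would likely average the score over several start states.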