Gradient based algorithms with loss functions and kernels for improved on-policy control

  • Authors:
  • Matthew Robards;Peter Sunehag

  • Affiliations:
  • Australian National University, Nicta, Australia;Australian National University, Nicta, Australia

  • Venue:
  • EWRL'11 Proceedings of the 9th European conference on Recent Advances in Reinforcement Learning
  • Year:
  • 2011

Quantified Score

Hi-index 0.00

Visualization

Abstract

We introduce and empirically evaluate two novel online gradient-based reinforcement learning algorithms with function approximation --- one model based, and the other model free. These algorithms come with the possibility of having non-squared loss functions which is novel in reinforcement learning, and seems to come with empirical advantages. We further extend a previous gradient based algorithm to the case of full control, by using generalized policy iteration. Theoretical properties of these algorithms are studied in a companion paper.