Gradient based algorithms with loss functions and kernels for improved on-policy control

Authors:
Matthew Robards;Peter Sunehag
Affiliations:
Australian National University, Nicta, Australia;Australian National University, Nicta, Australia
Venue:
EWRL'11 Proceedings of the 9th European conference on Recent Advances in Reinforcement Learning
Year:
2011

Citing 6
Cited 0

Gradient descent for general reinforcement learning

Proceedings of the 1998 conference on Advances in neural information processing systems II
Markov Decision Processes: Discrete Stochastic Dynamic Programming

Markov Decision Processes: Discrete Stochastic Dynamic Programming
Introduction to Reinforcement Learning

Introduction to Reinforcement Learning
Reinforcement learning with Gaussian processes

ICML '05 Proceedings of the 22nd international conference on Machine learning
Fast gradient-descent methods for temporal-difference learning with linear function approximation

ICML '09 Proceedings of the 26th Annual International Conference on Machine Learning
Sparse Kernel-SARSA(λ) with an eligibility trace

ECML PKDD'11 Proceedings of the 2011 European conference on Machine learning and knowledge discovery in databases - Volume Part III

Quantified Score

Hi-index	0.00

Visualization

Abstract

We introduce and empirically evaluate two novel online gradient-based reinforcement learning algorithms with function approximation --- one model based, and the other model free. These algorithms come with the possibility of having non-squared loss functions which is novel in reinforcement learning, and seems to come with empirical advantages. We further extend a previous gradient based algorithm to the case of full control, by using generalized policy iteration. Theoretical properties of these algorithms are studied in a companion paper.