Reinforcement learning is challenging when state and action spaces are continuous. The discretization of state and action spaces, and the real-time adaptation of that discretization, are critical issues in reinforcement learning problems. In our contribution we consider adaptive discretization and introduce a sparse gradient-based direct policy search method. We address the issue of efficient state/action selection in gradient-based direct policy search by imposing sparsity through an L1 penalty term. We propose to start learning with a fine discretization of the state space and to induce sparsity via the L1 norm. We compare the proposed approach to state-of-the-art methods, such as progressive widening Q-learning, which updates the discretization of the states adaptively, and to classic as well as sparse Q-learning with linear function approximation. Our experiments on standard reinforcement learning challenges demonstrate that the proposed approach is efficient.
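To make the idea of L1-induced sparsity over a fine state discretization concrete, the following is a minimal sketch, not the authors' exact algorithm: a REINFORCE-style policy gradient with a softmax policy linear in a fine one-hot state discretization, followed by a soft-thresholding (proximal) step that implements the L1 penalty and zeroes out the weights of discretization cells that do not contribute. The environment interface (`reset`/`step`), feature sizes, and all hyperparameters are illustrative assumptions.

```python
import numpy as np

def features(state, n_cells=200, low=-1.0, high=1.0):
    """One-hot encoding of a 1-D state over a fine uniform grid (assumed bounds)."""
    idx = int((state - low) / (high - low) * n_cells)
    idx = min(max(idx, 0), n_cells - 1)
    phi = np.zeros(n_cells)
    phi[idx] = 1.0
    return phi

def softmax_policy(theta, phi):
    """Action probabilities for a linear softmax policy; theta has shape (n_actions, n_cells)."""
    logits = theta @ phi
    logits -= logits.max()          # numerical stability
    p = np.exp(logits)
    return p / p.sum()

def reinforce_l1(env, n_actions, n_cells=200, episodes=500,
                 alpha=0.05, gamma=0.99, lam=1e-3, seed=0):
    """Policy-gradient updates with soft-thresholding as the L1 proximal step."""
    rng = np.random.default_rng(seed)
    theta = np.zeros((n_actions, n_cells))
    for _ in range(episodes):
        # Roll out one episode; env.reset()/env.step(a) -> (state, reward, done) is an assumed interface.
        traj, state, done = [], env.reset(), False
        while not done:
            phi = features(state, n_cells)
            p = softmax_policy(theta, phi)
            a = rng.choice(n_actions, p=p)
            state, r, done = env.step(a)
            traj.append((phi, a, r))
        # Monte-Carlo returns and gradient ascent on the log-likelihood of taken actions.
        G = 0.0
        for phi, a, r in reversed(traj):
            G = r + gamma * G
            p = softmax_policy(theta, phi)
            grad = -np.outer(p, phi)
            grad[a] += phi
            theta += alpha * G * grad
        # Proximal step for the L1 penalty: cells whose weights stay below the
        # threshold are driven exactly to zero, i.e. they are dropped from the discretization.
        theta = np.sign(theta) * np.maximum(np.abs(theta) - alpha * lam, 0.0)
    return theta
```

The soft-thresholding step is one standard way to handle a non-smooth L1 term inside a gradient method; how the paper itself combines the penalty with the policy-gradient update is described in the full text, and this sketch only illustrates the general mechanism of starting from a fine grid and letting the L1 norm prune it.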