Reinforcement learning is challenging when state and action spaces are continuous. The discretization of state and action spaces, and the real-time adaptation of that discretization, are critical issues in reinforcement learning problems. In our contribution we consider adaptive discretization and introduce a sparse gradient-based direct policy search method. We address the issue of efficient state/action selection in gradient-based direct policy search by imposing sparsity through an L1 penalty term. We propose to start learning with a fine discretization of the state space and to induce sparsity via the L1 norm. We compare the proposed approach to state-of-the-art methods, such as progressive widening Q-learning, which updates the discretization of the states adaptively, and to classic as well as sparse Q-learning with linear function approximation. Our experiments on standard reinforcement learning challenges demonstrate that the proposed approach is efficient.
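To make the idea of L1-induced sparsity over a fine state discretization concrete, the following is a minimal sketch, not the authors' exact algorithm: a REINFORCE-style policy gradient with a softmax policy linear in a fine one-hot state discretization, followed by a soft-thresholding (proximal) step that implements the L1 penalty and zeroes out the weights of discretization cells that do not contribute. The environment interface (`reset`/`step`), feature sizes, and all hyperparameters are illustrative assumptions.

```python
import numpy as np

def features(state, n_cells=200, low=-1.0, high=1.0):
    """One-hot encoding of a 1-D state over a fine uniform grid (assumed bounds)."""
    idx = int((state - low) / (high - low) * n_cells)
    idx = min(max(idx, 0), n_cells - 1)
    phi = np.zeros(n_cells)
    phi[idx] = 1.0
    return phi

def softmax_policy(theta, phi):
    """Action probabilities for a linear softmax policy; theta has shape (n_actions, n_cells)."""
    logits = theta @ phi
    logits -= logits.max()          # numerical stability
    p = np.exp(logits)
    return p / p.sum()

def reinforce_l1(env, n_actions, n_cells=200, episodes=500,
                 alpha=0.05, gamma=0.99, lam=1e-3, seed=0):
    """Policy-gradient updates with soft-thresholding as the L1 proximal step."""
    rng = np.random.default_rng(seed)
    theta = np.zeros((n_actions, n_cells))
    for _ in range(episodes):
        # Roll out one episode; env.reset()/env.step(a) -> (state, reward, done) is an assumed interface.
        traj, state, done = [], env.reset(), False
        while not done:
            phi = features(state, n_cells)
            p = softmax_policy(theta, phi)
            a = rng.choice(n_actions, p=p)
            state, r, done = env.step(a)
            traj.append((phi, a, r))
        # Monte-Carlo returns and gradient ascent on the log-likelihood of taken actions.
        G = 0.0
        for phi, a, r in reversed(traj):
            G = r + gamma * G
            p = softmax_policy(theta, phi)
            grad = -np.outer(p, phi)
            grad[a] += phi
            theta += alpha * G * grad
        # Proximal step for the L1 penalty: cells whose weights stay below the
        # threshold are driven exactly to zero, i.e. they are dropped from the discretization.
        theta = np.sign(theta) * np.maximum(np.abs(theta) - alpha * lam, 0.0)
    return theta
```

The soft-thresholding step is one standard way to handle a non-smooth L1 term inside a gradient method; how the paper itself combines the penalty with the policy-gradient update is described in the full text, and this sketch only illustrates the general mechanism of starting from a fine grid and letting the L1 norm prune it.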