Reinforcement learning (RL) is a widely used learning paradigm for adaptive agents. Several convergent and consistent RL algorithms exist and have been studied intensively. In their original form, these algorithms require the environment states and agent actions to take values in a relatively small discrete set. For the more difficult case where the state-action space is continuous, fuzzy representations for approximate, model-free RL have been proposed in the literature. In this work, we propose a fuzzy approximation architecture similar to those previously used for Q-learning, but we combine it with the model-based Q-value iteration algorithm. We prove that the resulting algorithm converges. We also give a modified, asynchronous variant of the algorithm that converges at least as fast as the original version. An illustrative simulation example is provided.
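To make the idea concrete, the following is a minimal sketch of model-based Q-value iteration with a fuzzy (triangular-membership) state approximator, in the spirit of the architecture described above. All specifics here are illustrative assumptions, not the authors' implementation: a 1-D state on [-1, 1], a small discrete action set, known dynamics `f` and reward `rho`, and fuzzy-set cores placed on a uniform grid. The parameter matrix `theta[i, j]` stores the Q-value of action `j` at the core of fuzzy set `i`, and the synchronous update is a contraction for discount factors below 1.

```python
import numpy as np

def memberships(x, centers):
    """Triangular membership degrees of scalar state x; degrees sum to 1."""
    x = float(np.clip(x, centers[0], centers[-1]))
    mu = np.zeros(len(centers))
    i = int(np.searchsorted(centers, x))
    if i == 0:
        mu[0] = 1.0
    else:
        w = (x - centers[i - 1]) / (centers[i] - centers[i - 1])
        mu[i - 1], mu[i] = 1.0 - w, w
    return mu

def fuzzy_q_iteration(f, rho, centers, actions, gamma=0.9, tol=1e-8, max_iter=2000):
    """Synchronous fuzzy Q-value iteration on the parameter matrix theta[i, j]."""
    N, M = len(centers), len(actions)
    # Model-based: precompute reward and next-state memberships for every
    # (fuzzy-core state, action) pair using the known dynamics and reward.
    R = np.array([[rho(c, u) for u in actions] for c in centers])
    MU = np.array([[memberships(f(c, u), centers) for u in actions] for c in centers])
    theta = np.zeros((N, M))
    for _ in range(max_iter):
        V = theta.max(axis=1)              # approximate value at each core state
        theta_new = R + gamma * MU.dot(V)  # contraction mapping update
        if np.max(np.abs(theta_new - theta)) < tol:
            theta = theta_new
            break
        theta = theta_new
    return theta

def greedy_action(x, theta, centers, actions):
    """Interpolate Q over membership degrees, then pick the best action."""
    q = memberships(x, centers).dot(theta)
    return actions[int(np.argmax(q))]

# Illustrative problem (hypothetical): drive the state toward 0.
centers = np.linspace(-1.0, 1.0, 21)
actions = np.array([-1.0, 0.0, 1.0])
f = lambda x, u: float(np.clip(x + 0.1 * u, -1.0, 1.0))  # known dynamics
rho = lambda x, u: -x ** 2                                # known reward
theta = fuzzy_q_iteration(f, rho, centers, actions)
```

Because the membership degrees are non-negative and sum to one, the update is a gamma-contraction in the max norm, which is the basis of the convergence guarantee the paper proves; an asynchronous variant would instead sweep the entries of `theta` in place, reusing fresh values immediately.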