In this paper we derive convergence rates for Q-learning and show how the convergence rate depends on the learning rate used. For a polynomial learning rate, one which is 1/t^ω at time t with ω ∈ (1/2, 1), we show that the convergence rate is polynomial in 1/(1-γ), where γ is the discount factor. In contrast, for a linear learning rate, one which is 1/t at time t, we show that the convergence rate has an exponential dependence on 1/(1-γ). In addition, we give a simple example showing that this exponential behavior is inherent for linear learning rates.
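The contrast between the two schedules can be illustrated with a minimal sketch. It is not taken from the paper. It assumes a hypothetical one-state, one-action MDP with zero reward, so the optimal value is Q* = 0 and the Q-learning update reduces to Q ← (1 - α_t) Q + α_t γ Q. The function names, the choice ω = 0.75, and the stopping threshold are illustrative assumptions.

```python
def polynomial_rate(t, omega=0.75):
    """Polynomial learning rate 1/t^omega, with omega assumed in (1/2, 1)."""
    return 1.0 / (t ** omega)


def linear_rate(t):
    """Linear learning rate 1/t."""
    return 1.0 / t


def steps_to_eps(rate, gamma, eps, q0=1.0, max_steps=10**6):
    """Count updates until |Q - Q*| <= eps on the toy MDP (Q* = 0).

    Hypothetical setting: a single state and action with reward 0, so the
    Q-learning target at every step is simply gamma * Q.
    Returns None if eps is not reached within max_steps.
    """
    q = q0
    for t in range(1, max_steps + 1):
        alpha = rate(t)
        # Standard Q-learning update with reward 0 and next-state value q.
        q = (1.0 - alpha) * q + alpha * (0.0 + gamma * q)
        if abs(q) <= eps:
            return t
    return None


if __name__ == "__main__":
    for gamma in (0.5, 0.8, 0.9):
        poly = steps_to_eps(polynomial_rate, gamma, eps=0.5)
        lin = steps_to_eps(linear_rate, gamma, eps=0.5)
        print(f"gamma={gamma}: polynomial={poly} steps, linear={lin} steps")
```

In this toy setting each update multiplies Q by 1 - α_t (1 - γ). With α_t = 1/t the error shrinks only about as fast as t^-(1-γ), so the number of steps needed to reach a fixed accuracy grows exponentially in 1/(1-γ). The polynomial schedule takes larger steps and needs far fewer updates as γ approaches 1.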