The method of temporal differences (TD) is one way of making consistent predictions about the future. This paper uses some analysis of Watkins (1989) to extend a convergence theorem due to Sutton (1988) from the case that uses only information from adjacent time steps to one involving information from arbitrary time steps. It also considers how this version of TD behaves in the face of linearly dependent representations for states, demonstrating that it still converges, but to a different answer from that of the least mean squares algorithm. Finally, it adapts Watkins' theorem that Q-learning, his closely related prediction and action learning method, converges with probability one, to demonstrate this strong form of convergence for a slightly modified version of TD.
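The TD(λ) method the abstract refers to can be sketched as follows. This is a minimal illustration of TD(λ) policy evaluation with accumulating eligibility traces and a linear value function, not the paper's own code; the function name, the trajectory format, and all parameter values are assumptions made for the example.

```python
import numpy as np

def td_lambda(episodes, phi, n_features, alpha=0.1, gamma=1.0, lam=0.9):
    """Linear TD(lambda) policy evaluation with accumulating traces.

    episodes: list of trajectories, each a list of (state, reward, next_state)
              tuples, with next_state = None at termination.
    phi:      feature map, state -> np.ndarray of length n_features.
    """
    w = np.zeros(n_features)
    for episode in episodes:
        e = np.zeros(n_features)              # eligibility trace, reset per episode
        for s, r, s_next in episode:
            v = w @ phi(s)
            v_next = 0.0 if s_next is None else w @ phi(s_next)
            delta = r + gamma * v_next - v    # one-step TD error
            e = gamma * lam * e + phi(s)      # accumulate trace over visited states
            w += alpha * delta * e            # credit all recently visited features
    return w
```

With one-hot (linearly independent) features this converges to the true predictions; the paper's point is that with linearly dependent features it still converges, but to a fixed point that generally differs from the least-mean-squares solution.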