TD(λ) is a popular family of algorithms for approximate policy evaluation in large MDPs. TD(λ) works by incrementally updating the value function after each observed transition. It has two major drawbacks: it may make inefficient use of data, and it requires the user to manually tune a stepsize schedule for good performance. For the case of linear value function approximations and λ = 0, the Least-Squares TD (LSTD) algorithm of Bradtke and Barto (1996, Machine learning, 22:1–3, 33–57) eliminates all stepsize parameters and improves data efficiency.

This paper updates Bradtke and Barto's work in three significant ways. First, it presents a simpler derivation of the LSTD algorithm. Second, it generalizes from λ = 0 to arbitrary values of λ; at the extreme of λ = 1, the resulting new algorithm is shown to be a practical, incremental formulation of supervised linear regression. Third, it presents a novel and intuitive interpretation of LSTD as a model-based reinforcement learning technique.
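The abstract does not give the update equations, so the following is only a minimal sketch of how a least-squares TD(λ) estimator is commonly written for linear value functions, not the paper's own pseudocode. The function name `lstd_lambda`, the `(phi, reward, phi_next)` episode format, the discount parameter `gamma`, and the use of a least-squares solve are all assumptions made for illustration. Terminal states are assumed to have an all-zero feature vector.

```python
import numpy as np


def lstd_lambda(episodes, n_features, lam=0.0, gamma=1.0):
    """Sketch of least-squares TD(lambda) for a linear value function.

    episodes: iterable of episodes; each episode is a list of
              (phi, reward, phi_next) tuples, where phi and phi_next are
              feature vectors (hypothetical data format, chosen for this sketch).
    Returns the weight vector beta such that V(x) is approximated by phi(x) @ beta.
    """
    A = np.zeros((n_features, n_features))
    b = np.zeros(n_features)

    for episode in episodes:
        z = None  # eligibility trace, reset at the start of every episode
        for phi, reward, phi_next in episode:
            phi = np.asarray(phi, dtype=float)
            phi_next = np.asarray(phi_next, dtype=float)
            if z is None:
                z = phi.copy()  # trace starts at the first state's features
            # Accumulate statistics; no stepsize parameter appears anywhere.
            A += np.outer(z, phi - gamma * phi_next)
            b += z * reward
            # Decay the trace and add the features of the next state.
            z = gamma * lam * z + phi_next

    # Solve A beta = b; lstsq is used so a singular A does not raise an error.
    beta, *_ = np.linalg.lstsq(A, b, rcond=None)
    return beta


# Toy deterministic chain 0 -> 1 -> 2 -> terminal, reward 1 per step,
# one-hot (tabular) features, undiscounted.
I = np.eye(3)
terminal = np.zeros(3)
episode = [(I[0], 1.0, I[1]), (I[1], 1.0, I[2]), (I[2], 1.0, terminal)]

beta_td0 = lstd_lambda([episode], n_features=3, lam=0.0)
beta_td1 = lstd_lambda([episode], n_features=3, lam=1.0)
print(beta_td0, beta_td1)
```

On this toy chain both λ = 0 and λ = 1 recover the true values (3, 2, 1) from a single episode. With λ = 1 the accumulated matrix reduces to the sum of outer products of the features with themselves, and the right-hand side to the features weighted by observed returns, which matches the abstract's description of that extreme as supervised linear regression.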