Linear least-squares algorithms for temporal difference learning

Authors:
Steven J. Bradtke;Andrew G. Barto
Venue:
Machine Learning - Special issue on reinforcement learning
Year:
1996

Citing 0
Cited 80

Technical Update: Least-Squares Temporal Difference Learning

Machine Learning
Relative Loss Bounds for Temporal-Difference Learning

Machine Learning
Least Squares Policy Evaluation Algorithms with Linear Function Approximation

Discrete Event Dynamic Systems
Reinforcement Learning in Situated Agents: Theoretical and Practical Solutions

EWLR-8 Proceedings of the 8th European Workshop on Learning Robots: Advances in Robot Learning
Least-Squares Methods in Reinforcement Learning for Control

SETN '02 Proceedings of the Second Hellenic Conference on AI: Methods and Applications of Artificial Intelligence
Chess Neighborhoods, Function Combination, and Reinforcement Learning

CG '00 Revised Papers from the Second International Conference on Computers and Games
Solving factored MDPs using non-homogeneous partitions

Artificial Intelligence - special issue on planning with uncertainty and incomplete information
Least-squares policy iteration

The Journal of Machine Learning Research
Variance Reduction Techniques for Gradient Estimates in Reinforcement Learning

The Journal of Machine Learning Research
Extending XCSF beyond linear approximation

GECCO '05 Proceedings of the 7th annual conference on Genetic and evolutionary computation
Improving generalization in the XCSF classifier system using linear least-squares

GECCO '05 Proceedings of the 7th annual workshop on Genetic and evolutionary computation
Hybrid least-squares methods for reinforcement learning

IEA/AIE'2003 Proceedings of the 16th international conference on Developments in applied artificial intelligence
A Generalized Kalman Filter for Fixed Point Approximation and Efficient Temporal-Difference Learning

Discrete Event Dynamic Systems
Automatic basis function construction for approximate dynamic programming and reinforcement learning

ICML '06 Proceedings of the 23rd international conference on Machine learning
Performance Loss Bounds for Approximate Value Iteration with State Aggregation

Mathematics of Operations Research
Generalization in the XCSF Classifier System: Analysis, Improvement, and Extension

Evolutionary Computation
Analyzing feature generation for value-function approximation

Proceedings of the 24th international conference on Machine learning
Tracking value function dynamics to improve reinforcement learning with piecewise linear function approximation

Proceedings of the 24th international conference on Machine learning
Reinforcement learning for a biped robot based on a CPG-actor-critic method

Neural Networks
Model-Based Reinforcement Learning for Partially Observable Games with Sampling-Based State Estimation

Neural Computation
Restricted gradient-descent algorithm for value-function approximation in reinforcement learning

Artificial Intelligence
Learning near-optimal policies with Bellman-residual minimization based fitted policy iteration and a single sample path

Machine Learning
An analysis of linear models, linear value-function approximation, and feature selection for reinforcement learning

Proceedings of the 25th international conference on Machine learning
A semiparametric statistical approach to model-free policy evaluation

Proceedings of the 25th international conference on Machine learning
Preconditioned temporal difference learning

Proceedings of the 25th international conference on Machine learning
Sigma point policy iteration

Proceedings of the 7th international joint conference on Autonomous agents and multiagent systems - Volume 1
Approximating Arbitrary Reinforcement Signal by Learning Classifier Systems using Micro Genetic Algorithm

Fundamenta Informaticae
Policy Iteration for Learning an Exercise Policy for American Options

Recent Advances in Reinforcement Learning
Projected equation methods for approximate solution of large linear systems

Journal of Computational and Applied Mathematics
Reinforcement Learning: A Tutorial Survey and Recent Advances

INFORMS Journal on Computing
Regularization and feature selection in least-squares temporal difference learning

ICML '09 Proceedings of the 26th Annual International Conference on Machine Learning
Fast gradient-descent methods for temporal-difference learning with linear function approximation

ICML '09 Proceedings of the 26th Annual International Conference on Machine Learning
Online exploration in least-squares policy iteration

Proceedings of The 8th International Conference on Autonomous Agents and Multiagent Systems - Volume 2
Least Squares SVM for Least Squares TD Learning

Proceedings of the 2006 conference on ECAI 2006: 17th European Conference on Artificial Intelligence August 29 -- September 1, 2006, Riva del Garda, Italy
Learning Representation and Control in Markov Decision Processes: New Frontiers

Foundations and Trends® in Machine Learning
Reinforcement learning for a CPG-driven biped robot

AAAI'04 Proceedings of the 19th national conference on Artificial intelligence
Incremental least-squares temporal difference learning

AAAI'06 Proceedings of the 21st national conference on Artificial intelligence - Volume 1
Hybrid least-squares algorithms for approximate policy evaluation

Machine Learning
Optimal Online Learning Procedures for Model-Free Policy Evaluation

ECML PKDD '09 Proceedings of the European Conference on Machine Learning and Knowledge Discovery in Databases: Part II
Efficient reinforcement learning using recursive least-squares methods

Journal of Artificial Intelligence Research
Natural actor-critic algorithms

Automatica (Journal of IFAC)
An Additive Reinforcement Learning

ICANN '09 Proceedings of the 19th International Conference on Artificial Neural Networks: Part I
Derivatives of logarithmic stationary distributions for policy gradient reinforcement learning

Neural Computation
Commentary: Perspectives on Stochastic Optimization Over Time

INFORMS Journal on Computing
A general fuzzified CMAC based reinforcement learning control for ship steering using recursive least-squares algorithm

Neurocomputing
Model-based least-squares policy evaluation

AI'03 Proceedings of the 16th Canadian society for computational studies of intelligence conference on Advances in artificial intelligence
Error Bounds for Approximations from Projected Linear Equations

Mathematics of Operations Research
Linear options

Proceedings of the 9th International Conference on Autonomous Agents and Multiagent Systems - Volume 1
Adaptive bases for reinforcement learning

ECML PKDD'10 Proceedings of the 2010 European conference on Machine learning and knowledge discovery in databases: Part I
Revisiting natural actor-critics with value function approximation

MDAI'10 Proceedings of the 7th international conference on Modeling decisions for artificial intelligence
Sparse approximate dynamic programming for dialog management

SIGDIAL '10 Proceedings of the 11th Annual Meeting of the Special Interest Group on Discourse and Dialogue
Kalman temporal differences

Journal of Artificial Intelligence Research
Sample-efficient batch reinforcement learning for dialogue management optimization

ACM Transactions on Speech and Language Processing (TSLP)
Relational preference rules for control

Artificial Intelligence
Generalized TD Learning

The Journal of Machine Learning Research
Value function approximation in zero-sum Markov games

UAI'02 Proceedings of the Eighteenth conference on Uncertainty in artificial intelligence
Policy iteration for factored MDPs

UAI'00 Proceedings of the Sixteenth conference on Uncertainty in artificial intelligence
Monte Carlo matrix inversion policy evaluation

UAI'03 Proceedings of the Nineteenth conference on Uncertainty in Artificial Intelligence
Minimax search and reinforcement learning for adversarial tetris

SETN'10 Proceedings of the 6th Hellenic conference on Artificial Intelligence: theories, models and applications
Q-Learning and Enhanced Policy Iteration in Discounted Dynamic Programming

Mathematics of Operations Research
Actor-Critic algorithm based on incremental least-squares temporal difference with eligibility trace

ICIC'11 Proceedings of the 7th international conference on Advanced Intelligent Computing Theories and Applications: with aspects of artificial intelligence
Letter to the editor: Asymptotic analysis of value prediction by well-specified and misspecified models

Neural Networks
ℓ1-Penalized projected Bellman residual

EWRL'11 Proceedings of the 9th European conference on Recent Advances in Reinforcement Learning
Regularized least squares temporal difference learning with nested ℓ2 and ℓ1 penalization

EWRL'11 Proceedings of the 9th European conference on Recent Advances in Reinforcement Learning
Recursive least-squares learning with eligibility traces

EWRL'11 Proceedings of the 9th European conference on Recent Advances in Reinforcement Learning
Value function approximation through sparse Bayesian modeling

EWRL'11 Proceedings of the 9th European conference on Recent Advances in Reinforcement Learning
Batch, off-policy and model-free apprenticeship learning

EWRL'11 Proceedings of the 9th European conference on Recent Advances in Reinforcement Learning
Adaptive reservoir computing through evolution and learning

Neurocomputing
Low complexity proto-value function learning from sensory observations with incremental slow feature analysis

ICANN'12 Proceedings of the 22nd international conference on Artificial Neural Networks and Machine Learning - Volume Part II
An efficient L2-norm regularized least-squares temporal difference learning algorithm

Knowledge-Based Systems
Performance bounds for λ policy iteration and application to the game of Tetris

The Journal of Machine Learning Research
Finite-sample analysis of least-squares policy iteration

The Journal of Machine Learning Research
A reinforcement learning approach to autonomous decision-making in smart electricity markets

Machine Learning
Reward shaping for statistical optimisation of dialogue management

SLSP'13 Proceedings of the First international conference on Statistical Language and Speech Processing
Better generalization with forecasts

IJCAI'13 Proceedings of the Twenty-Third international joint conference on Artificial Intelligence
Linear Bayesian reinforcement learning

IJCAI'13 Proceedings of the Twenty-Third international joint conference on Artificial Intelligence
Reinforcement learning algorithms with function approximation: Recent advances and applications

Information Sciences: an International Journal
Construction of approximation spaces for reinforcement learning

The Journal of Machine Learning Research
Dopamine ramps are a consequence of reward prediction errors

Neural Computation
