On the existence of fixed points for approximate value iteration and temporal-difference learning

Authors:
D. P. de Farias; B. Van Roy
Venue:
Journal of Optimization Theory and Applications
Year:
2000

Citing 0
Cited 8

A Generalized Kalman Filter for Fixed Point Approximation and Efficient Temporal-Difference Learning
Discrete Event Dynamic Systems

Performance Loss Bounds for Approximate Value Iteration with State Aggregation
Mathematics of Operations Research

An analysis of reinforcement learning with function approximation
Proceedings of the 25th international conference on Machine learning

Tuning continual exploration in reinforcement learning: An optimality property of the Boltzmann strategy
Neurocomputing

Using control theory for analysis of reinforcement learning and optimal policy properties in grid-world problems
ICIC'09 Proceedings of the Intelligent computing 5th international conference on Emerging intelligent computing technology and applications

Approximate Dynamic Programming via a Smoothed Linear Program
Operations Research

Dynamic policy programming
The Journal of Machine Learning Research

Policy oscillation is overshooting
Neural Networks

Abstract