These lecture notes are intended to give a tutorial introduction to the formulation and analysis of reinforcement learning problems. In these problems, an agent chooses actions to take in some environment, aiming to maximize a reward function. Many control, scheduling, planning, and game-playing tasks can be formulated in this way, as problems of controlling a Markov decision process. We review the classical dynamic programming approaches to finding optimal controllers. For problems with large state spaces these exact techniques are impractical, so we review methods based on approximate value functions estimated by simulation. In particular, we discuss the motivation for, and shortcomings of, the TD(λ) algorithm.
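To make the TD(λ) idea concrete, here is a minimal sketch of tabular TD(λ) with accumulating eligibility traces, applied to a toy random-walk Markov chain. The problem setup (a chain of states with termination at either end and reward only at the right exit) is an illustrative choice, not an example taken from the notes themselves.

```python
import random

def td_lambda_random_walk(n_states=5, alpha=0.1, gamma=1.0, lam=0.9,
                          n_episodes=1000, seed=0):
    """Estimate state values of a simple random walk with TD(lambda).

    States 1..n_states; stepping left from state 1 or right from
    state n_states terminates the episode, and only the right exit
    pays reward 1. (A toy problem chosen for illustration.)
    """
    rng = random.Random(seed)
    V = [0.0] * (n_states + 2)          # V[0] and V[n_states+1] are terminal
    for _ in range(n_episodes):
        e = [0.0] * (n_states + 2)      # eligibility traces, reset each episode
        s = (n_states + 1) // 2         # start in the middle of the chain
        while 1 <= s <= n_states:
            s_next = s + rng.choice((-1, 1))
            r = 1.0 if s_next == n_states + 1 else 0.0
            delta = r + gamma * V[s_next] - V[s]   # TD error
            e[s] += 1.0                            # accumulating trace
            for i in range(1, n_states + 1):
                V[i] += alpha * delta * e[i]       # credit recent states
                e[i] *= gamma * lam                # decay traces
            s = s_next
    return V[1:n_states + 1]
```

For this chain the true values are i/(n_states+1) for state i, so the learned estimates should increase monotonically from left to right; λ interpolates between one-step TD (λ=0) and Monte Carlo updates (λ=1).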