Learning curve bounds for a Markov decision process with undiscounted rewards

Authors:
Lawrence K. Saul;Satinder P. Singh
Affiliations:
Center for Biological and Computational Learning, Massachusetts Institute of Technology, 79 Amherst Street, E10-243, Cambridge, MA;Harlequin Inc., One Cambridge Center, Cambridge, MA and Center for Biological and Computational Learning, Massachusetts Institute of Technology, 79 Amherst Street, E10-243, Cambridge, MA
Venue:
COLT '96 Proceedings of the ninth annual conference on Computational learning theory
Year:
1996

Technical Note: Q-Learning
Machine Learning

Rigorous learning curve bounds from statistical mechanics
COLT '94 Proceedings of the seventh annual conference on Computational learning theory

Efficient reinforcement learning
COLT '94 Proceedings of the seventh annual conference on Computational learning theory

Reinforcement learning algorithms for average-payoff Markovian decision processes
AAAI '94 Proceedings of the twelfth national conference on Artificial intelligence (vol. 1)

Learning to act using real-time dynamic programming
Artificial Intelligence - Special volume on computational research on interaction and agency, part 1

Markov decision processes in large state spaces
COLT '95 Proceedings of the eighth annual conference on Computational learning theory

Dynamic Programming and Optimal Control, Two Volume Set

Learning to Predict by the Methods of Temporal Differences
Machine Learning
