Markov Decision Processes: Discrete Stochastic Dynamic Programming
Markov Decision Processes: Discrete Stochastic Dynamic Programming
R-max - a general polynomial time algorithm for near-optimal reinforcement learning
The Journal of Machine Learning Research
An Empirical Evaluation of Interval Estimation for Markov Decision Processes
ICTAI '04 Proceedings of the 16th IEEE International Conference on Tools with Artificial Intelligence
A theoretical analysis of Model-Based Interval Estimation
ICML '05 Proceedings of the 22nd international conference on Machine learning
Hi-index | 0.00 |
We give an example from the theory of Markov decision processes which shows that the "optimism in the face of uncertainty" heuristics may fail to make any progress. This is due to the impossibility to falsify a belief that a (transition) probability is larger than 0. Our example shows the utility of Popper's demand of falsifiability of hypotheses in the area of artificial intelligence.