Stochastic Shortest Path problems (SSPs), a subclass of Markov Decision Problems (MDPs), can be solved efficiently with Real-Time Dynamic Programming (RTDP). In practice, however, MDP models are often uncertain, being estimated from statistics or expert guesses. The usual response is robust planning: searching for the policy that performs best under the worst admissible model. This paper shows how RTDP can be made robust in the common case where each transition probability is only known to lie within a given interval.
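The worst-model backup this setting calls for can be sketched as follows. This is a minimal illustration of a robust Bellman backup for an interval MDP, not the paper's actual algorithm; all names (`worst_case_distribution`, `robust_backup`, the dictionary-based model) are hypothetical. The key step is the adversary's choice: within the probability intervals, mass is shifted greedily toward the successors with the highest cost-to-go.

```python
# Hedged sketch: one robust (worst-model) backup for an interval MDP,
# cost-minimization setting. All names are illustrative.

def worst_case_distribution(lo, hi, values):
    """Choose p_i in [lo_i, hi_i] with sum(p) = 1 maximizing sum(p_i * values_i).

    Greedy: start every successor at its lower bound, then spend the
    remaining probability mass on successors with the highest cost-to-go.
    Assumes the intervals admit a valid distribution (sum(lo) <= 1 <= sum(hi)).
    """
    p = list(lo)
    budget = 1.0 - sum(lo)
    for i in sorted(range(len(values)), key=lambda i: values[i], reverse=True):
        extra = min(hi[i] - lo[i], budget)
        p[i] += extra
        budget -= extra
    return p

def robust_backup(state, V, actions, cost, succ, p_lo, p_hi):
    """One RTDP-style backup at `state` against the worst admissible model."""
    best = float("inf")
    for a in actions(state):
        s_next = succ(state, a)                      # successor states of (state, a)
        vals = [V[s] for s in s_next]                # current cost-to-go estimates
        lo = [p_lo[(state, a, s)] for s in s_next]   # interval lower bounds
        hi = [p_hi[(state, a, s)] for s in s_next]   # interval upper bounds
        p = worst_case_distribution(lo, hi, vals)
        q = cost(state, a) + sum(pi * v for pi, v in zip(p, vals))
        best = min(best, q)                          # greedy action choice
    return best
```

A plain RTDP trial would then repeatedly apply `robust_backup` along a simulated trajectory from the start state; only the inner expectation changes relative to the non-robust algorithm.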