Efficient solutions to factored MDPs with imprecise transition probabilities

Authors:
Karina Valdivia Delgado;Scott Sanner;Leliane Nunes de Barros
Affiliations:
EACH, Universidade de São Paulo, Av. Arlindo Béttio, 1000 - Ermelino Matarazzo São Paulo - SP, Brazil;NICTA and the Australian National University, Canberra, ACT 2601, Australia;EACH, Universidade de São Paulo, Av. Arlindo Béttio, 1000 - Ermelino Matarazzo São Paulo - SP, Brazil and IME, Universidade de São Paulo, Rua de Matão, 1010 - Cidade Unive ...
Venue:
Artificial Intelligence
Year:
2011

Abstract

When modeling real-world decision-theoretic planning problems in the Markov Decision Process (MDP) framework, it is often impossible to obtain a completely accurate estimate of transition probabilities. For example, uncertainty arises naturally in the transition specification when MDP transition models are elicited from an expert or estimated from data, or when insufficient state knowledge makes the transition distributions non-stationary. To obtain the most robust policy under such transition uncertainty, Markov Decision Processes with Imprecise Transition Probabilities (MDP-IPs) have been introduced. Unfortunately, while various solution algorithms exist for MDP-IPs, they often require external calls to optimization routines and thus can be extremely time-consuming in practice. To address this deficiency, we introduce the factored MDP-IP and propose efficient dynamic programming methods to exploit its structure. Noting that the key computational bottleneck in the solution of factored MDP-IPs is the need to repeatedly solve nonlinear constrained optimization problems, we show how to target approximation techniques to drastically reduce the computational overhead of the nonlinear solver while producing bounded, approximately optimal solutions. Our results show up to two orders of magnitude speedup in comparison to traditional "flat" dynamic programming approaches and up to an order of magnitude speedup over the extension of factored MDP approximate value iteration techniques to MDP-IPs, while producing the lowest error of any approximation algorithm evaluated.
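
To make the "flat" dynamic programming baseline mentioned in the abstract concrete, the following is a minimal sketch of robust (max-min) value iteration. It is not the paper's factored algorithm. It assumes the simplest kind of imprecision, where each transition probability is only known to lie in an interval, so the inner minimization can be solved greedily instead of by a nonlinear solver. The function names, the nested-list data layout, and the discount factor are all illustrative assumptions.

```python
from math import inf


def worst_case_distribution(lower, upper, values):
    """Return p with lower <= p <= upper and sum(p) == 1 that minimizes sum(p * values).

    Assumes interval bounds that admit a valid distribution
    (sum(lower) <= 1 <= sum(upper)); this is a simplification of the
    general constraint sets an MDP-IP may use.
    """
    p = list(lower)
    slack = 1.0 - sum(lower)
    # Hand the remaining probability mass to the lowest-valued successors first.
    for j in sorted(range(len(values)), key=lambda k: values[k]):
        extra = min(upper[j] - lower[j], slack)
        p[j] += extra
        slack -= extra
    return p


def robust_value_iteration(lower, upper, reward, gamma=0.9, tol=1e-8, max_iters=10000):
    """Flat max-min value iteration.

    lower[s][a] and upper[s][a] are lists of interval bounds over successor
    states; reward[s][a] is the immediate reward (hypothetical layout).
    """
    n_states = len(reward)
    values = [0.0] * n_states
    for _ in range(max_iters):
        new_values = []
        for s in range(n_states):
            best = -inf
            for a in range(len(reward[s])):
                # Inner step: the adversarial choice of transition probabilities.
                p = worst_case_distribution(lower[s][a], upper[s][a], values)
                q = reward[s][a] + gamma * sum(pj * vj for pj, vj in zip(p, values))
                best = max(best, q)
            new_values.append(best)
        delta = max(abs(x - y) for x, y in zip(new_values, values))
        values = new_values
        if delta < tol:
            break
    return values
```

Every Bellman backup in this sketch enumerates all states and solves one inner minimization per state-action pair. That per-backup cost is what the factored representation and the targeted approximations described in the abstract aim to reduce.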