Optimal solutions to Markov decision problems may be very sensitive with respect to the state transition probabilities. In many practical problems, the estimation of these probabilities is far from accurate. Hence, estimation errors are limiting factors in applying Markov decision processes to real-world problems. We consider a robust control problem for a finite-state, finite-action Markov decision process, where uncertainty on the transition matrices is described in terms of possibly nonconvex sets. We show that perfect duality holds for this problem, and that as a consequence, it can be solved with a variant of the classical dynamic programming algorithm, the "robust dynamic programming" algorithm. We show that a particular choice of the uncertainty sets, involving likelihood regions or entropy bounds, leads to both a statistically accurate representation of uncertainty, and a complexity of the robust recursion that is almost the same as that of the classical recursion. Hence, robustness can be added at practically no extra computing cost. We derive similar results for other uncertainty sets, including one with a finite number of possible values for the transition matrices. We describe in a practical path planning example the benefits of using a robust strategy instead of the classical optimal strategy; even if the uncertainty level is only crudely guessed, the robust strategy yields a much better worst-case expected travel time.
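The robust recursion described above can be sketched for the simplest uncertainty model mentioned in the abstract: a finite set of candidate transition matrices for each state-action pair. The following is a minimal illustration, not the paper's implementation; the function name `robust_value_iteration`, the infinite-horizon discounted setting, and the stopping tolerance are all assumptions made for this sketch. The adversary's inner step reduces to a maximum over the finite set, so each robust Bellman backup costs only a constant factor more than the classical one.

```python
import numpy as np

def robust_value_iteration(costs, transition_sets, gamma=0.95, tol=1e-8, max_iter=10_000):
    """Robust value iteration for a finite-state, finite-action MDP whose
    transition distribution for each (state, action) pair is only known to
    lie in a finite uncertainty set (a sketch; the paper also treats
    likelihood-region and entropy-bound sets).

    costs:           array of shape (S, A), immediate cost c(s, a)
    transition_sets: transition_sets[s][a] is a list of candidate
                     probability vectors over next states (each length S)
    """
    S, A = costs.shape
    V = np.zeros(S)
    for _ in range(max_iter):
        V_new = np.empty(S)
        for s in range(S):
            q = np.empty(A)
            for a in range(A):
                # Inner problem: adversary picks the cost-maximizing
                # distribution from the uncertainty set.
                worst = max(p @ V for p in transition_sets[s][a])
                q[a] = costs[s, a] + gamma * worst
            # Outer problem: controller minimizes against the worst case.
            V_new[s] = q.min()
        if np.max(np.abs(V_new - V)) < tol:
            return V_new
        V = V_new
    return V
```

For general convex (or even nonconvex) uncertainty sets, only the inner `max` changes: it becomes an optimization over the set, which for the likelihood and entropy sets studied in the paper can be solved by a simple one-dimensional bisection, preserving the near-classical complexity of the recursion.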