In this paper we propose a robust formulation for discrete-time dynamic programming (DP). The objective of the robust formulation is to systematically mitigate the sensitivity of the DP optimal policy to ambiguity in the underlying transition probabilities. The ambiguity is modeled by associating a set of conditional measures with each state-action pair. Consequently, in the robust formulation each policy has a set of measures associated with it. We prove that when this set of measures has a certain "rectangularity" property, all of the main results for finite- and infinite-horizon DP extend to natural robust counterparts. We discuss techniques from Nilim and El Ghaoui [17] for constructing suitable sets of conditional measures that allow one to efficiently solve for the optimal robust policy. We also show that robust DP is equivalent to stochastic zero-sum games with perfect information.
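As a rough illustration of the robust formulation described above, the following Python sketch runs value iteration in which every state-action pair carries its own set of candidate transition distributions, and the worst case is taken independently for each pair (one simple way to obtain a rectangular set). Everything in it is an assumption made for illustration and is not taken from the paper: the reward-maximizing convention, the finite ambiguity sets, the discount factor, the function name `robust_value_iteration`, and the two-state example.

```python
from typing import Dict, List, Tuple

# Hypothetical container: for each (state, action) pair, a finite list of
# candidate transition distributions over next states.
AmbiguitySet = Dict[Tuple[int, int], List[List[float]]]


def robust_value_iteration(
    n_states: int,
    n_actions: int,
    reward: List[List[float]],
    ambiguity: AmbiguitySet,
    discount: float = 0.9,
    tol: float = 1e-8,
    max_iter: int = 10_000,
) -> Tuple[List[float], List[int]]:
    """Iterate V(s) = max_a [ r(s, a) + discount * min_p sum_t p(t) V(t) ]."""
    values = [0.0] * n_states
    policy = [0] * n_states
    for _ in range(max_iter):
        new_values = []
        for s in range(n_states):
            best_value, best_action = float("-inf"), 0
            for a in range(n_actions):
                # Inner problem: the adversary picks the least favorable
                # measure for this (s, a) pair, independently of other pairs.
                worst = min(
                    sum(p[t] * values[t] for t in range(n_states))
                    for p in ambiguity[(s, a)]
                )
                q = reward[s][a] + discount * worst
                if q > best_value:
                    best_value, best_action = q, a
            new_values.append(best_value)
            policy[s] = best_action
        gap = max(abs(x - y) for x, y in zip(new_values, values))
        values = new_values
        if gap < tol:
            break
    return values, policy


# Illustrative two-state problem (made up for this sketch).
# State 1 is absorbing with zero reward. In state 0, action 0 ("safe") pays 1
# and stays in state 0 for certain; action 1 ("risky") pays 2, but its
# transition law is ambiguous: it may stay in state 0 or jump to state 1.
reward = [[1.0, 2.0], [0.0, 0.0]]
ambiguity: AmbiguitySet = {
    (0, 0): [[1.0, 0.0]],
    (0, 1): [[1.0, 0.0], [0.0, 1.0]],
    (1, 0): [[0.0, 1.0]],
    (1, 1): [[0.0, 1.0]],
}
values, policy = robust_value_iteration(2, 2, reward, ambiguity)
```

In this toy instance the robust policy chooses the safe action in state 0, since the worst-case measure for the risky action sends the system to the absorbing zero-reward state. A nominal DP that trusted only the first measure of the risky action would prefer the risky action instead. The inner minimum over measures is also where the zero-sum game reading comes from: the controller moves first, then an adversary selects the transition law.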