We describe a general technique for determining upper bounds on maximal values (or lower bounds on minimal costs) in stochastic dynamic programs. In this approach, we relax the nonanticipativity constraints that require decisions to depend only on the information available at the time a decision is made and impose a “penalty” that punishes violations of nonanticipativity. In applications, the hope is that this relaxed version of the problem will be simpler to solve than the original dynamic program. The upper bounds provided by this dual approach complement lower bounds on values that may be found by simulating with heuristic policies. We describe the theory underlying this dual approach and establish weak duality, strong duality, and complementary slackness results that are analogous to the duality results of linear programming. We also study properties of good penalties. Finally, we demonstrate the use of this dual approach in an adaptive inventory control problem with an unknown and changing demand distribution and in valuing options with stochastic volatilities and interest rates. These are complex problems of significant practical interest that are quite difficult to solve to optimality. In these examples, our dual approach requires relatively little additional computation and leads to tight bounds on the optimal values.
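The bounding idea can be illustrated on a toy problem that is not taken from the paper. The sketch below assumes a hypothetical three-period selling problem: one offer arrives per period, each uniform on {1, 2, 3, 4}, and the seller must accept exactly one offer. All names and parameters (`T`, `OFFERS`, the threshold of 4) are illustrative assumptions. The code enumerates every offer path, so all four quantities are exact: a lower bound from a heuristic threshold policy, the optimal value from backward induction, an upper bound from the perfect-information relaxation with zero penalty, and an upper bound with a penalty built from the exact value function (martingale differences, as in dual methods for optimal stopping). The last penalty is only computable here because the problem is tiny; in practice one would expect to substitute an approximate value function.

```python
from itertools import product

T = 3                  # number of offers (hypothetical toy horizon)
OFFERS = [1, 2, 3, 4]  # each offer is i.i.d. uniform on this set (assumption)


def continuation_values():
    """Backward induction. c[t] is the expected value of rejecting offer t
    and continuing optimally; c[0] is the optimal value of the problem."""
    c = [0.0] * (T + 1)  # c[T] = 0: nothing is left after the last offer
    for t in range(T - 1, -1, -1):
        c[t] = sum(max(x, c[t + 1]) for x in OFFERS) / len(OFFERS)
    return c


def heuristic_reward(path, threshold=4):
    """Nonanticipative policy: accept the first offer >= threshold,
    otherwise accept the final offer."""
    for x in path[:-1]:
        if x >= threshold:
            return x
    return path[-1]


def relaxed_reward(path, c=None):
    """Perfect-information relaxation: choose the best period with the whole
    path known. If c is given, subtract the penalty M_t, the running sum of
    value-function martingale differences; otherwise the penalty is zero."""
    best = float("-inf")
    m = 0.0
    for t, x in enumerate(path, start=1):
        if c is not None:
            # V_t(x) = max(x, c[t]); its conditional mean one step earlier is c[t-1]
            m += max(x, c[t]) - c[t - 1]
        best = max(best, x - m)
    return best


def expectation(f):
    """Exact expectation over all equally likely offer paths."""
    paths = list(product(OFFERS, repeat=T))
    return sum(f(p) for p in paths) / len(paths)


c = continuation_values()
optimal = c[0]                                         # exact optimal value
lower = expectation(heuristic_reward)                  # heuristic lower bound
upper_zero = expectation(relaxed_reward)               # zero-penalty upper bound
upper_ideal = expectation(lambda p: relaxed_reward(p, c))  # ideal-penalty bound

print(lower, optimal, upper_zero, upper_ideal)
```

In this toy instance the heuristic value and the zero-penalty bound bracket the optimal value (weak duality), while the penalty built from the exact value function closes the gap entirely (strong duality). The zero-penalty bound is simply the expected maximum offer, which shows why a relaxation without a penalty can be loose.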