We consider a nonhomogeneous infinite-horizon Markov Decision Process (MDP) problem with multiple optimal first-period policies. We seek an algorithm that, given finite data, delivers an optimal first-period policy. Such an algorithm can thus recursively generate, within a rolling-horizon procedure, an infinite-horizon optimal solution to the original problem. However, it can happen that no such algorithm exists, i.e., the MDP is not well posed. Equivalently, it is impossible to solve the problem with a finite amount of data. Assuming increasing marginal returns in actions (with respect to states) and stochastically increasing state transitions (with respect to actions), we provide an algorithm that is guaranteed to solve the given MDP whenever it is well posed. This algorithm determines, in finite time, a forecast horizon for which an optimal solution delivers an optimal first-period policy. As an application, we solve all well-posed instances of the time-varying version of the classic asset-selling problem.
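The forecast-horizon idea in the abstract can be illustrated on a toy time-varying asset-selling problem: solve finite-horizon truncations of increasing length by backward induction, and stop once the optimal first-period sell threshold stabilizes. This is a minimal sketch under hypothetical assumptions — the offer distribution `offer_probs`, the discount factor, and the tolerance-based stopping rule are all illustrative simplifications, not the paper's actual forecast-horizon detection test.

```python
# Toy time-varying asset-selling problem (illustrative assumptions only).
# Offers take values in OFFERS; the distribution drifts with the period t.
OFFERS = [0.0, 1.0, 2.0, 3.0, 4.0]
BETA = 0.9  # discount factor (assumed)

def offer_probs(t):
    """Hypothetical nonstationary offer distribution at period t."""
    weights = [1 + ((t + k) % 3) for k in range(len(OFFERS))]
    total = sum(weights)
    return [w / total for w in weights]

def first_period_threshold(horizon):
    """Backward induction over a truncated horizon of the given length.

    Returns the continuation value of rejecting the period-1 offer;
    the optimal first-period policy sells iff the offer exceeds it.
    """
    c = 0.0  # value of rejecting at the final period (no salvage)
    for t in range(horizon, 1, -1):
        probs = offer_probs(t)
        # Value of waiting: discounted expected max of next offer vs. waiting again.
        c = BETA * sum(p * max(o, c) for p, o in zip(probs, OFFERS))
    return c

def rolling_threshold(tol=1e-6, max_horizon=200):
    """Heuristic horizon search: lengthen the truncation until the
    first-period threshold stabilizes (a stand-in for a forecast-horizon
    test, not the paper's finite-time detection procedure)."""
    prev = first_period_threshold(1)
    for n in range(2, max_horizon + 1):
        cur = first_period_threshold(n)
        if abs(cur - prev) < tol:
            return n, cur
        prev = cur
    raise RuntimeError("no stable first-period threshold within max_horizon")
```

Because offers are bounded and future payoffs are discounted, the first-period threshold changes by at most a geometrically shrinking amount as the horizon grows, so the tolerance check eventually triggers; the paper's contribution is a test that certifies an *exactly* optimal first-period policy in finite time for well-posed instances, which this numerical-tolerance sketch does not do.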