Average Optimality in Nonhomogeneous Infinite Horizon Markov Decision Processes

Authors:
Allise O. Wachs; Irwin E. Schochetman; Robert L. Smith
Affiliations:
Integral Concepts, Inc., West Bloomfield, Michigan 48325; Mathematics and Statistics, Oakland University, Rochester, Michigan 48309; Industrial and Operations Engineering, University of Michigan, Ann Arbor, Michigan 48109
Venue:
Mathematics of Operations Research
Year:
2011

Abstract

We consider a nonhomogeneous stochastic infinite horizon optimization problem whose objective is to minimize the overall average cost per period of an infinite sequence of actions (average optimality). Optimal solutions to such problems will in general be nonstationary. Moreover, a solution that initially makes poor decisions, and then selects wisely thereafter, can be average optimal. However, we seek average optimal solutions with optimal short-term, as well as long-term, behavior. Our approach is to first transform our stochastic problem into one that is deterministic, using the standard device of formulating the problem as one of choosing a sequence of policies, as opposed to actions. Within this deterministic framework, states become probability distributions over the original stochastic states. Then, by weakening the notion of state reachability, and strengthening the notion of efficiency traditionally used in the deterministic framework, we prove that such efficient solutions exist and are average optimal, thus simultaneously exhibiting both optimal long- and short-run behavior. This deterministic view of the property of stochastic ergodicity offers the potential to relax the traditional conditions for average optimality that use coefficients of ergodicity, as well as the opportunity to strengthen the criterion of average optimality through the property of efficiency.
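
The deterministic reformulation described in the abstract can be illustrated with a minimal sketch. It is not the authors' construction: the function names `step` and `average_cost`, the array layout, and the two-state, two-action data are all hypothetical, and the infinite-horizon average is approximated by a finite horizon `N`. The sketch treats a probability distribution over the original states as the deterministic state, advances it one period under a chosen policy, and shows that a solution with ten poor initial decisions still has an average cost that shrinks toward the optimum as `N` grows.

```python
import numpy as np

def step(dist, policy, P, c):
    """Apply one period's policy to a distribution over the original states.

    dist   : shape (S,), probability of each original state
    policy : shape (S,), action chosen in each state this period
    P      : shape (A, S, S), P[a, s, s2] = transition probability this period
    c      : shape (A, S), c[a, s] = cost of action a in state s this period
    Returns (expected cost this period, next distribution).
    """
    states = np.arange(len(dist))
    # Expected cost: weight each state's cost by its probability.
    cost = float(dist @ c[policy, states])
    # Row s of the policy-induced matrix is P[policy[s], s, :].
    next_dist = dist @ P[policy, states, :]
    return cost, next_dist

def average_cost(dist0, policies, P_seq, c_seq):
    """Finite-horizon average of the expected per-period costs."""
    dist, total = np.asarray(dist0, dtype=float), 0.0
    for policy, P, c in zip(policies, P_seq, c_seq):
        cost, dist = step(dist, policy, P, c)
        total += cost
    return total / len(policies)

# Hypothetical data (assumption): action 0 is free and keeps the state,
# action 1 costs 1 in every state and swaps the two states.
S = 2
P = np.stack([np.eye(S), np.eye(S)[::-1]])
c = np.array([[0.0, 0.0], [1.0, 1.0]])
good, bad = np.array([0, 0]), np.array([1, 1])

N = 1000
lazy = [bad] * 10 + [good] * (N - 10)  # poor start, wise thereafter
# The data happen to be the same every period here; a nonhomogeneous
# problem would pass a different P and c for each period.
avg = average_cost([0.5, 0.5], lazy, [P] * N, [c] * N)
print(avg)
```

Because the ten costly periods contribute a fixed total, their share of the average vanishes as `N` increases, which is why average optimality alone does not rule out poor short-term behavior.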