Average Optimality in Nonhomogeneous Infinite Horizon Markov Decision Processes

  • Authors:
  • Allise O. Wachs; Irwin E. Schochetman; Robert L. Smith

  • Affiliations:
  • Integral Concepts, Inc., West Bloomfield, Michigan 48325; Mathematics and Statistics, Oakland University, Rochester, Michigan 48309; Industrial and Operations Engineering, University of Michigan, Ann Arbor, Michigan 48109

  • Venue:
  • Mathematics of Operations Research
  • Year:
  • 2011

Abstract

We consider a nonhomogeneous stochastic infinite horizon optimization problem whose objective is to minimize the overall average cost per period of an infinite sequence of actions (average optimality). Optimal solutions to such problems will in general be nonstationary. Moreover, a solution that initially makes poor decisions, and then selects wisely thereafter, can be average optimal. However, we seek average optimal solutions with optimal short-term, as well as long-term, behavior. Our approach is to first transform our stochastic problem into one that is deterministic, using the standard device of formulating the problem as one of choosing a sequence of policies, as opposed to actions. Within this deterministic framework, states become probability distributions over the original stochastic states. Then, by weakening the notion of state reachability, and strengthening the notion of efficiency traditionally used in the deterministic framework, we prove that such efficient solutions exist and are average optimal, thus simultaneously exhibiting both optimal long- and short-run behavior. This deterministic view of the property of stochastic ergodicity offers the potential to relax the traditional conditions for average optimality that use coefficients of ergodicity, as well as the opportunity to strengthen the criterion of average optimality through the property of efficiency.
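
The transformation described in the abstract can be illustrated with a small sketch. It is not the authors' construction or notation: the function names, the two-state toy data, and the use of a finite-horizon truncation of the average cost are all assumptions made for illustration. The sketch shows only the basic idea that, once a sequence of policies is fixed, the "state" evolves deterministically as a probability distribution over the original stochastic states, and the expected cost per period can be averaged along that trajectory.

```python
from typing import List, Sequence, Tuple

Dist = List[float]  # probability distribution over the original states


def step_distribution(dist: Sequence[float],
                      policy: Sequence[int],
                      transition: Sequence[Sequence[Sequence[float]]]) -> Dist:
    """Deterministic update of the distribution under one period's policy.

    transition[s][a][s2] is the (hypothetical) probability of moving from
    state s to state s2 when action a is taken; policy[s] is the action
    chosen in state s.
    """
    n = len(dist)
    nxt = [0.0] * n
    for s in range(n):
        a = policy[s]
        for s2 in range(n):
            nxt[s2] += dist[s] * transition[s][a][s2]
    return nxt


def expected_stage_cost(dist: Sequence[float],
                        policy: Sequence[int],
                        cost: Sequence[Sequence[float]]) -> float:
    """Expected one-period cost, where cost[s][a] is the cost of action a in state s."""
    return sum(dist[s] * cost[s][policy[s]] for s in range(len(dist)))


def average_cost(initial: Sequence[float],
                 policies: Sequence[Sequence[int]],
                 transitions: Sequence[Sequence[Sequence[Sequence[float]]]],
                 costs: Sequence[Sequence[Sequence[float]]]) -> Tuple[float, Dist]:
    """Average expected cost per period over a finite truncation of the horizon.

    Period t uses its own policies[t], transitions[t], and costs[t], so the
    data may be nonhomogeneous. Returns the average and the final distribution.
    """
    dist = list(initial)
    total = 0.0
    for policy, trans, cost in zip(policies, transitions, costs):
        total += expected_stage_cost(dist, policy, cost)
        dist = step_distribution(dist, policy, trans)
    return total / len(policies), dist


# Toy data (assumed): 2 states, 2 actions. Action 0 stays put, action 1 mixes evenly.
P = [[[1.0, 0.0], [0.5, 0.5]],
     [[0.0, 1.0], [0.5, 0.5]]]
base_cost = [[0.0, 1.0], [2.0, 1.0]]
doubled_cost = [[2.0 * c for c in row] for row in base_cost]

policies = [[0, 1], [0, 1], [0, 1]]           # stay in state 0, mix in state 1
transitions = [P, P, P]                       # same dynamics in every period
costs = [base_cost, doubled_cost, base_cost]  # costs vary by period

avg, final_dist = average_cost([0.0, 1.0], policies, transitions, costs)
print(avg, final_dist)
```

The average-optimality criterion in the abstract concerns the infinite horizon, so a finite truncation like this one only approximates it; it is meant to make the "distributions as states" viewpoint concrete, not to reproduce the paper's efficiency or reachability arguments.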