The convergence of value iteration in average cost Markov decision chains

Authors: Linn I. Sennott
Affiliations: Department of Mathematics 4520, Illinois State University, Normal, IL 61790-4520, USA
Venue: Operations Research Letters
Year: 1996

Citing 7

Control of Markov chains with long-run average cost criterion: the dynamic programming equations. SIAM Journal on Control and Optimization.
Discrete-time controlled Markov processes with average cost criterion: a survey. SIAM Journal on Control and Optimization.
Linear Programming and Average Optimality of Markov Control Processes on Borel Spaces---Unbounded Costs. SIAM Journal on Control and Optimization.
Another set of conditions for average optimality in Markov control processes. Systems & Control Letters.
Introduction to Stochastic Dynamic Programming: Probability and Mathematical
Comparing recent assumptions for the existence of average optimal stationary policies. Operations Research Letters.
On strong average optimality of Markov decision processes with unbounded costs. Operations Research Letters.

Cited 2

Value iteration and optimization of multiclass queueing networks. Queueing Systems: Theory and Applications.
Serial Agile Production Systems with Automation. Operations Research.

Abstract

Let $J$ be the (constant) minimum long-run expected average cost in a Markov decision chain with countable state space. We desire the existence of an average cost optimal stationary policy and, in addition, that $J$ is the limit of $v_n(\cdot)/n$, where $v_n(\cdot)$ is the minimum $n$-step expected cost. Three sets of sufficient conditions for this to hold are given. The results generalize Ghosh and Marcus (1992).
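
To illustrate the ratio of the minimum n-step expected cost to n described in the abstract, the following is a minimal Python sketch of value iteration on a hypothetical two-state, two-action chain. The finite state space, the arrays P and c, and the function name n_step_values are assumptions made for illustration only; the paper treats countable state spaces, and its sufficient conditions are not checked here.

```python
import numpy as np


def n_step_values(P, c, n_steps):
    """Run n_steps of value iteration starting from v_0 = 0.

    P: array of shape (A, S, S), P[a, s, t] = probability of moving s -> t under action a.
    c: array of shape (S, A), c[s, a] = one-step cost of action a in state s.
    Returns (v_n, v_n / n), each of shape (S,).
    """
    v = np.zeros(c.shape[0])  # v_0 = 0 (no terminal cost)
    for _ in range(n_steps):
        # q[s, a] = c[s, a] + sum_t P[a, s, t] * v[t]
        q = c + np.einsum("ast,t->sa", P, v)
        v = q.min(axis=1)  # minimize over actions
    return v, v / n_steps


# Hypothetical toy instance: action 0 = stay, action 1 = switch state.
P = np.array([
    [[1.0, 0.0], [0.0, 1.0]],  # stay
    [[0.0, 1.0], [1.0, 0.0]],  # switch
])
c = np.array([
    [1.0, 3.0],  # state 0: stay costs 1, switch costs 3
    [2.0, 0.5],  # state 1: stay costs 2, switch costs 0.5
])

v, ratio = n_step_values(P, c, 1000)
print(ratio)  # both entries are close to the minimum average cost, which is 1 here
```

In this toy instance the best long-run behavior is to reach state 0 and stay there, so the minimum average cost is 1, and the printed ratio approaches that value from every starting state as the number of steps grows.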