Policy iteration type algorithms for recurrent state Markov decision processes. Computers and Operations Research.
An empirical study of policy convergence in Markov decision process value iteration. Computers and Operations Research.
Solving the uncertainty of vertical handovers in multi-radio home networks. Computer Communications.
We propose a new value iteration method for the classical average cost Markovian decision problem, under the assumption that all stationary policies are unichain and that, furthermore, there exists a state that is recurrent under all stationary policies. This method is motivated by a relation between the average cost problem and an associated stochastic shortest path problem. In contrast to standard relative value iteration, our method involves a weighted sup-norm contraction, and for this reason it admits a Gauss-Seidel implementation. Computational tests indicate that the Gauss-Seidel version of the new method substantially outperforms the standard method for difficult problems.
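To make the problem class concrete, the following is a minimal sketch of standard relative value iteration, the baseline method the abstract contrasts with. The two-state MDP, the function name, and all numbers are illustrative assumptions, not taken from the paper; the example is unichain with state 1 recurrent under both policies, matching the paper's assumptions.

```python
def relative_value_iteration(costs, probs, ref=0, tol=1e-10, max_iter=100_000):
    """Standard (Jacobi-style) relative value iteration for an
    average-cost unichain MDP.

    costs[i][a]     one-stage cost of action a in state i
    probs[i][a][j]  transition probability from i to j under action a
    Returns (lam, h): the average-cost estimate and the differential
    cost vector, normalized so that h[ref] == 0.
    """
    n = len(costs)
    h = [0.0] * n
    lam = 0.0
    for _ in range(max_iter):
        # One Bellman backup: t[i] = min_a [ g(i,a) + sum_j p_ij(a) h(j) ]
        t = [min(c + sum(p * h[j] for j, p in enumerate(pv))
                 for c, pv in zip(costs[i], probs[i]))
             for i in range(n)]
        lam = t[ref]                   # current average-cost estimate
        new_h = [x - lam for x in t]   # subtract value at the reference state
        if max(abs(a - b) for a, b in zip(new_h, h)) < tol:
            return lam, new_h
        h = new_h
    return lam, h


# Illustrative two-state MDP: state 0 has two actions, state 1 has one.
costs = [[2.0, 0.5], [1.0]]
probs = [[[0.8, 0.2], [0.1, 0.9]],   # state 0, actions 0 and 1
         [[0.3, 0.7]]]               # state 1, action 0
lam, h = relative_value_iteration(costs, probs)
# For this example the optimal average cost is 0.875 (choose action 1
# in state 0), which lam approaches as the iteration converges.
```

The paper's contribution is a different recursion that is a weighted sup-norm contraction; a Gauss-Seidel version of it would update each h[i] in place within the sweep, using already-updated components, which is what makes the reported speedups possible. The plain relative value iteration above lacks that contraction property.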