Policy iteration type algorithms for recurrent state Markov decision processes. Computers and Operations Research.
An empirical study of policy convergence in Markov decision process value iteration. Computers and Operations Research.
Solving the uncertainty of vertical handovers in multi-radio home networks. Computer Communications.
We propose a new value iteration method for the classical average cost Markovian decision problem, under the assumption that all stationary policies are unichain and that, furthermore, there exists a state that is recurrent under all stationary policies. This method is motivated by a relation between the average cost problem and an associated stochastic shortest path problem. In contrast to standard relative value iteration, our method involves a weighted sup-norm contraction, and for this reason it admits a Gauss-Seidel implementation. Computational tests indicate that the Gauss-Seidel version of the new method substantially outperforms the standard method for difficult problems.
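To make the problem class concrete, the following is a minimal sketch of standard relative value iteration, the baseline method the abstract contrasts with. The two-state MDP, the function name, and all numbers are illustrative assumptions, not taken from the paper; the example is unichain with state 1 recurrent under both policies, matching the paper's assumptions.

```python
def relative_value_iteration(costs, probs, ref=0, tol=1e-10, max_iter=100_000):
    """Standard (Jacobi-style) relative value iteration for an
    average-cost unichain MDP.

    costs[i][a]     one-stage cost of action a in state i
    probs[i][a][j]  transition probability from i to j under action a
    Returns (lam, h): the average-cost estimate and the differential
    cost vector, normalized so that h[ref] == 0.
    """
    n = len(costs)
    h = [0.0] * n
    lam = 0.0
    for _ in range(max_iter):
        # One Bellman backup: t[i] = min_a [ g(i,a) + sum_j p_ij(a) h(j) ]
        t = [min(c + sum(p * h[j] for j, p in enumerate(pv))
                 for c, pv in zip(costs[i], probs[i]))
             for i in range(n)]
        lam = t[ref]                   # current average-cost estimate
        new_h = [x - lam for x in t]   # subtract value at the reference state
        if max(abs(a - b) for a, b in zip(new_h, h)) < tol:
            return lam, new_h
        h = new_h
    return lam, h


# Illustrative two-state MDP: state 0 has two actions, state 1 has one.
costs = [[2.0, 0.5], [1.0]]
probs = [[[0.8, 0.2], [0.1, 0.9]],   # state 0, actions 0 and 1
         [[0.3, 0.7]]]               # state 1, action 0
lam, h = relative_value_iteration(costs, probs)
# For this example the optimal average cost is 0.875 (choose action 1
# in state 0), which lam approaches as the iteration converges.
```

The paper's contribution is a different recursion that is a weighted sup-norm contraction; a Gauss-Seidel version of it would update each h[i] in place within the sweep, using already-updated components, which is what makes the reported speedups possible. The plain relative value iteration above lacks that contraction property.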