Discretized approximations for POMDP with average cost

Authors:
Huizhen Yu;Dimitri P. Bertsekas
Affiliations:
Lab for Information and Decisions, Cambridge, MA;Lab for Information and Decisions, Cambridge, MA
Venue:
UAI '04 Proceedings of the 20th conference on Uncertainty in artificial intelligence
Year:
2004

Abstract

In this paper, we propose a new lower approximation scheme for POMDPs with discounted and average cost criteria. The approximating functions are determined by their values at a finite number of belief points and can be computed efficiently using value iteration algorithms for finite-state MDPs. While several lower approximation schemes have been proposed earlier for discounted problems, ours appears to be the first of its kind for average cost problems. We focus primarily on the average cost case and show that the corresponding approximation can be computed efficiently using multi-chain algorithms for finite-state MDPs. We give a preliminary analysis showing that, regardless of whether the optimal average cost J* of the POMDP exists, the approximation obtained is a lower bound on the liminf optimal average cost function. It can also be used to calculate an upper bound on the limsup optimal average cost function, as well as bounds on the cost of executing the stationary policy associated with the approximation. We show that the cost approximation converges when the optimal average cost is constant and the optimal differential cost is continuous.
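To make concrete the idea of an approximation that is "determined by values at a finite number of belief points", the following is a minimal sketch of a generic grid-interpolation lower bound for the discounted cost case. It is not the scheme constructed in the paper, and it does not cover the average cost case. It relies on the standard fact that the optimal discounted cost is concave in the belief, so linear interpolation between grid values under-estimates it. The two-state model, its numbers, and the names `grid_lower_bound` and `blind_policy_cost` are all hypothetical and chosen only for illustration.

```python
import numpy as np

# Hypothetical 2-state, 2-action, 2-observation POMDP (illustrative numbers only).
P = np.array([[[0.9, 0.1], [0.2, 0.8]],   # P[u, s, s']: transition probabilities
              [[0.5, 0.5], [0.5, 0.5]]])
O = np.array([[[0.8, 0.2], [0.3, 0.7]],   # O[u, s', y]: observation probabilities
              [[0.5, 0.5], [0.5, 0.5]]])
g = np.array([[1.0, 0.0],                 # g[u, s]: per-stage cost
              [0.5, 0.5]])
alpha = 0.9                               # discount factor


def grid_lower_bound(n_points, iters=2000, tol=1e-10):
    """Value iteration on a belief grid with linear interpolation.

    A belief is represented by p = probability of state 1, so the grid
    is a set of points in [0, 1]. Returns the grid and the values on it.
    """
    grid = np.linspace(0.0, 1.0, n_points)
    J = np.zeros(n_points)
    for _ in range(iters):
        J_new = np.empty_like(J)
        for k, p in enumerate(grid):
            xi = np.array([1.0 - p, p])
            q = np.empty(2)
            for u in range(2):
                joint = (xi @ P[u])[:, None] * O[u]   # joint[s', y]
                py = joint.sum(axis=0)                # P(y | xi, u)
                cost_to_go = 0.0
                for y in range(2):
                    if py[y] > 0.0:
                        p_next = joint[1, y] / py[y]  # Bayes update of the belief
                        # Interpolate the grid values at the next belief.
                        cost_to_go += py[y] * np.interp(p_next, grid, J)
                q[u] = xi @ g[u] + alpha * cost_to_go
            J_new[k] = q.min()
        if np.max(np.abs(J_new - J)) < tol:
            return grid, J_new
        J = J_new
    return grid, J


def blind_policy_cost(u):
    """Per-state discounted cost of always applying action u (an upper bound)."""
    return np.linalg.solve(np.eye(2) - alpha * P[u], g[u])


if __name__ == "__main__":
    grid, lower = grid_lower_bound(11)
    for p, v in zip(grid, lower):
        print(f"belief p={p:.1f}  lower bound={v:.4f}")
```

Because the grid values define a finite-state problem, the iteration above is ordinary value iteration on a finite-state MDP. The cost of any fixed policy, such as the blind policy computed by `blind_policy_cost`, gives a simple upper bound against which the lower bound can be sanity-checked.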