Partially Observable Markov Decision Processes: A Geometric Technique and Analysis

Authors: Hao Zhang
Affiliations: Marshall School of Business, University of Southern California, Los Angeles, California 90089
Venue: Operations Research
Year: 2010

Abstract

This paper presents a novel framework for studying partially observable Markov decision processes (POMDPs) with finite state, action, and observation sets and discounted rewards. The framework is based solely on the future-reward vectors associated with future policies, which makes it more parsimonious than the traditional framework based on belief vectors. It reveals a connection between the POMDP problem and two computational geometry problems, namely finding the vertices of a convex hull and finding the Minkowski sum of convex polytopes; this connection can help solve the POMDP problem more efficiently. The framework clarifies some existing algorithms over both finite and infinite horizons and sheds new light on them. As a useful structural result, it also facilitates the comparison of POMDPs with respect to their degree of observability.
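
The two geometric operations named in the abstract can be illustrated with a small sketch. It is not the paper's algorithm. It assumes a two-state problem, so that each future-reward vector is a point in the plane, and the helper names `hull_vertices` and `minkowski_sum` are hypothetical. The hull is computed with Andrew's monotone chain, and the Minkowski sum is formed naively from all pairwise sums, which is only practical for small inputs.

```python
from itertools import product


def cross(o, a, b):
    """Z-component of (a - o) x (b - o); positive for a counter-clockwise turn."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def hull_vertices(points):
    """Vertices of the convex hull of 2-D points (Andrew's monotone chain).

    Returned in counter-clockwise order; interior and collinear points are dropped.
    """
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts
    lower, upper = [], []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    # The last point of each chain is the first point of the other chain.
    return lower[:-1] + upper[:-1]


def minkowski_sum(a_pts, b_pts):
    """Vertices of the Minkowski sum of two convex polygons given by point sets.

    Naive approach: form every pairwise sum, then keep only the hull vertices.
    """
    sums = [(a[0] + b[0], a[1] + b[1]) for a, b in product(a_pts, b_pts)]
    return hull_vertices(sums)


# Hypothetical reward vectors for a two-state problem (illustrative values only).
triangle = [(0, 0), (1, 0), (0, 1)]
result = minkowski_sum(triangle, triangle)
```

In this example nine pairwise sums collapse to the three vertices of a larger triangle, which shows why pruning to hull vertices keeps the set of candidate vectors small. Note that hull vertices are only a superset of the vectors that matter to a maximizer: a vertex such as `(0, 0)` is dominated and would still need to be removed by a separate dominance check.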