Optimal Threshold Policies for Multivariate Stopping-Time POMDPs

Authors:
Vikram Krishnamurthy
Affiliations:
Department of Electrical and Computer Engineering, University of British Columbia, Vancouver, Canada V6T 1Z4
Venue:
ECSQARU '09 Proceedings of the 10th European Conference on Symbolic and Quantitative Approaches to Reasoning with Uncertainty
Year:
2009

Abstract

This paper addresses the solution of multivariate partially observed Markov decision processes (POMDPs). We give sufficient conditions on the cost function, the dynamics of the Markov chain target, and the observation probabilities so that the optimal scheduling policy has a threshold structure with respect to the multivariate TP2 ordering. We present stochastic approximation algorithms to estimate the parameterized threshold policy.
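
To make the ordering concrete, the sketch below checks the standard definition of the multivariate TP2 (totally positive of order 2) order between two probability mass functions on a finite product lattice: p dominates q if p(i ∨ j) q(i ∧ j) ≥ p(i) q(j) for all multi-indices i and j, where ∨ and ∧ are the componentwise maximum and minimum. The function name `tp2_dominates`, the tolerance argument, and the example distributions are illustrative assumptions and are not taken from the paper.

```python
from itertools import product

import numpy as np


def tp2_dominates(p, q, tol=1e-12):
    """Return True if p >=_TP2 q on a finite product lattice.

    p and q are arrays of the same shape holding probability masses;
    each array axis is one coordinate of the multivariate state.
    Hypothetical helper for illustration only.
    """
    p = np.asarray(p, dtype=float)
    q = np.asarray(q, dtype=float)
    if p.shape != q.shape:
        raise ValueError("p and q must have the same shape")

    # Enumerate every multi-index of the lattice.
    indices = list(product(*(range(n) for n in p.shape)))
    for i in indices:
        for j in indices:
            join = tuple(max(a, b) for a, b in zip(i, j))  # i v j
            meet = tuple(min(a, b) for a, b in zip(i, j))  # i ^ j
            if p[join] * q[meet] < p[i] * q[j] - tol:
                return False
    return True


# Bivariate example: products of independent marginals (assumed values).
p = np.outer([0.2, 0.8], [0.3, 0.7])  # mass shifted toward larger states
q = np.outer([0.5, 0.5], [0.5, 0.5])  # uniform
print(tp2_dominates(p, q), tp2_dominates(q, p))
```

In the univariate case this check reduces to the monotone likelihood ratio order. The brute-force loop is quadratic in the number of lattice points, so it is only meant for small state spaces.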