Optimal Threshold Policies for Multivariate Stopping-Time POMDPs

  • Authors: Vikram Krishnamurthy
  • Affiliations: Department of Electrical and Computer Engineering, University of British Columbia, Vancouver, Canada V6T 1Z4
  • Venue: ECSQARU '09: Proceedings of the 10th European Conference on Symbolic and Quantitative Approaches to Reasoning with Uncertainty
  • Year: 2009

Abstract

This paper deals with solving multivariate partially observed Markov decision processes (POMDPs). We give sufficient conditions on the cost function, the dynamics of the Markov chain target, and the observation probabilities so that the optimal scheduling policy has a threshold structure with respect to the multivariate TP2 ordering. We present stochastic approximation algorithms to estimate the parameterized threshold policy.