Point-based online value iteration algorithm in large POMDP

  • Authors:
  • Bo Wu; Hong-Yan Zheng; Yan-Peng Feng

  • Affiliations:
  • Education Technology and Information Center, Shenzhen Polytechnic, Shenzhen, China 518055 (all three authors)

  • Venue:
  • Applied Intelligence
  • Year:
  • 2014

Abstract

The partially observable Markov decision process (POMDP) is an ideal framework for sequential decision-making under uncertainty in stochastic domains. However, solving POMDPs in real-time systems is notoriously computationally intractable. To address this problem, this paper proposes a point-based online value iteration (PBOVI) algorithm that performs value backups at specific reachable belief points, rather than over the entire belief simplex, to speed up computation; uses branch-and-bound pruning to prune the AND/OR tree of belief states online; and introduces a novel scheme for reusing belief states that have already been searched, to avoid repeated computation. Experimental and simulation results show that the proposed algorithm can simultaneously satisfy the requirements of low error and high timeliness in real-time systems.
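
To illustrate what "value backup at specific reachable belief points" involves, the following is a minimal Python sketch of the standard Bayesian belief update and the textbook point-based backup at a single belief point. It is not the PBOVI implementation from the paper: the branch-and-bound pruning and belief-reuse steps are omitted. The function names (`belief_update`, `point_backup`), the array layout (`T[a][s][s2]`, `O[a][s2][o]`, `R[a][s]`), and the small tiger-style model are all assumptions made for illustration.

```python
# Hypothetical tiger-style POMDP used only for illustration.
# States: 0 = tiger left, 1 = tiger right.
# Actions: 0 = listen, 1 = open left, 2 = open right.
# Observations: 0 = hear left, 1 = hear right.
T = [
    [[1.0, 0.0], [0.0, 1.0]],  # listen: state unchanged
    [[0.5, 0.5], [0.5, 0.5]],  # open left: problem resets
    [[0.5, 0.5], [0.5, 0.5]],  # open right: problem resets
]
O = [
    [[0.85, 0.15], [0.15, 0.85]],  # listen: 85% accurate
    [[0.5, 0.5], [0.5, 0.5]],      # open: uninformative
    [[0.5, 0.5], [0.5, 0.5]],
]
R = [
    [-1.0, -1.0],     # listen
    [-100.0, 10.0],   # open left
    [10.0, -100.0],   # open right
]


def dot(x, y):
    """Inner product of two equal-length vectors."""
    return sum(xi * yi for xi, yi in zip(x, y))


def belief_update(b, a, o, T, O):
    """Bayes update: b'(s2) is proportional to O[a][s2][o] * sum_s T[a][s][s2] * b[s].

    Returns the new belief and the probability of observing o.
    """
    n = len(b)
    unnorm = [
        O[a][s2][o] * sum(T[a][s][s2] * b[s] for s in range(n))
        for s2 in range(n)
    ]
    prob_o = sum(unnorm)
    if prob_o == 0.0:
        raise ValueError("observation has zero probability under this belief")
    return [u / prob_o for u in unnorm], prob_o


def point_backup(b, alphas, T, O, R, gamma):
    """Standard point-based backup at belief b over a set of alpha-vectors.

    Returns the greedy action and the new alpha-vector for b.
    """
    n = len(b)
    n_obs = len(O[0][0])
    best = None
    for a in range(len(T)):
        alpha_a = list(R[a])
        for o in range(n_obs):
            # Project every alpha-vector back through (a, o) and keep the
            # projection that is best at this particular belief point.
            projections = [
                [
                    sum(T[a][s][s2] * O[a][s2][o] * alpha[s2] for s2 in range(n))
                    for s in range(n)
                ]
                for alpha in alphas
            ]
            g = max(projections, key=lambda v: dot(v, b))
            alpha_a = [x + gamma * gi for x, gi in zip(alpha_a, g)]
        value = dot(alpha_a, b)
        if best is None or value > best[0]:
            best = (value, a, alpha_a)
    return best[1], best[2]


if __name__ == "__main__":
    b1, p = belief_update([0.5, 0.5], 0, 0, T, O)
    print(b1, p)
    print(point_backup([0.5, 0.5], [[0.0, 0.0]], T, O, R, 0.95))
```

Because the backup touches only the supplied belief point, its cost grows with the number of actions, observations, and alpha-vectors rather than with the size of the belief simplex, which is the saving the abstract refers to.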