Point-based online value iteration algorithm in large POMDP

Authors:
Bo Wu;Hong-Yan Zheng;Yan-Peng Feng
Affiliations:
Education Technology and Information Center, Shenzhen Polytechnic, Shenzhen, China 518055 (all three authors)
Venue:
Applied Intelligence
Year:
2014

Abstract

The partially observable Markov decision process (POMDP) is an ideal framework for sequential decision-making under uncertainty in stochastic domains. However, solving POMDPs in real-time systems is notoriously computationally intractable. To address this problem, this paper proposes a point-based online value iteration (PBOVI) algorithm with three components: it performs value backups at specific reachable belief points, rather than over the entire belief simplex, to speed up computation; it uses branch-and-bound pruning to prune the AND/OR tree of belief states online; and it reuses belief states that have already been searched to avoid repeated computation. Experimental and simulation results show that the proposed algorithm can simultaneously satisfy the requirements of low error and high timeliness in real-time systems.
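
To make the idea of backing up values at individual belief points concrete, the following is a minimal Python sketch of a Bayesian belief update and a standard point-based backup over a set of alpha-vectors. It is not the authors' PBOVI implementation: the abstract does not give the algorithm's details, so this shows only the generic backup used by point-based methods, without the branch-and-bound pruning or belief reuse. The toy model (`T`, `O`, `R`) and the function names `belief_update` and `point_backup` are assumptions introduced for illustration.

```python
from typing import List

Vector = List[float]

# Hypothetical two-state, two-action, two-observation model.
# All numbers are illustrative assumptions, not taken from the paper.
T = [  # T[a][s][s2] = P(s2 | s, a)
    [[1.0, 0.0], [0.0, 1.0]],
    [[0.5, 0.5], [0.5, 0.5]],
]
O = [  # O[a][s2][o] = P(o | s2, a)
    [[0.85, 0.15], [0.15, 0.85]],
    [[0.5, 0.5], [0.5, 0.5]],
]
R = [  # R[a][s] = immediate reward for action a in state s
    [1.0, 0.0],
    [0.0, 1.0],
]


def dot(u: Vector, v: Vector) -> float:
    return sum(x * y for x, y in zip(u, v))


def belief_update(b: Vector, a: int, o: int, T, O) -> Vector:
    """Bayes filter: posterior belief after taking action a and observing o."""
    n = len(b)
    unnorm = [
        O[a][s2][o] * sum(T[a][s][s2] * b[s] for s in range(n))
        for s2 in range(n)
    ]
    z = sum(unnorm)  # P(o | b, a)
    if z == 0.0:
        raise ValueError("observation has zero probability under (b, a)")
    return [p / z for p in unnorm]


def point_backup(b: Vector, alphas: List[Vector], T, O, R, gamma: float) -> Vector:
    """Return the alpha-vector produced by one Bellman backup at belief b only."""
    n = len(b)
    best_val, best_alpha = float("-inf"), None
    for a in range(len(T)):
        g_a = list(R[a])
        for o in range(len(O[a][0])):
            # Project every current alpha-vector back through (a, o).
            projected = [
                [
                    sum(T[a][s][s2] * O[a][s2][o] * alpha[s2] for s2 in range(n))
                    for s in range(n)
                ]
                for alpha in alphas
            ]
            # Keep only the projection that is best at this belief point.
            best = max(projected, key=lambda g: dot(g, b))
            g_a = [g_a[s] + gamma * best[s] for s in range(n)]
        val = dot(g_a, b)
        if val > best_val:
            best_val, best_alpha = val, g_a
    return best_alpha
```

Because the backup touches only the given belief `b`, its cost grows with the number of belief points and alpha-vectors rather than with the size of the belief simplex, which is the source of the speed-up the abstract describes. An online planner would typically call `belief_update` to expand reachable beliefs from the current one and `point_backup` to improve the value estimate at each of them.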