United We Stand: Population Based Methods for Solving Unknown POMDPs

Authors:
Noel Welsh; Jeremy Wyatt
Affiliations:
School of Computer Science, The University of Birmingham, Birmingham, UK B15 2TT; School of Computer Science, The University of Birmingham, Birmingham, UK B15 2TT
Venue:
Recent Advances in Reinforcement Learning
Year:
2008

Abstract

Solving large unknown partially observable Markov decision processes (POMDPs) is an open research problem. Policy search is an attractive solution method because it scales with the size of the policy, which is typically much simpler than the environment. We present a global search algorithm capable of finding good policies for POMDPs that are substantially larger than those in previously reported results. Our algorithm is general: we show that it can be used with, and improves the performance of, existing local search techniques such as gradient ascent. Sharing information between the members of the population is the key to our algorithm, and we show that it results in better performance than equivalent parallel searches that do not share information. Unlike previous work, our algorithm does not require the size of the policy to be known in advance.
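
The abstract gives no algorithmic detail, so the following is only a minimal illustrative sketch of the general idea it describes: a population of local searchers that periodically share information, compared with the same searchers running independently. It is not the authors' algorithm. The toy `objective` (a stand-in for expected policy return), the random hill-climbing `local_step`, the function `population_search`, and the "replace the worst member with a copy of the best" sharing rule are all assumptions made for illustration.

```python
import random


def objective(x):
    # Hypothetical stand-in for expected policy return; maximum is 0 at the origin.
    return -sum(v * v for v in x)


def local_step(x, rng, step=0.1):
    # One hill-climbing step: propose a random perturbation and keep it
    # only if it is no worse than the current point.
    candidate = [v + rng.gauss(0.0, step) for v in x]
    return candidate if objective(candidate) >= objective(x) else x


def population_search(pop_size=8, dim=5, rounds=50, share=True, seed=0):
    # Returns (initial_best, final_best) scores over the population.
    rng = random.Random(seed)
    population = [[rng.uniform(-5.0, 5.0) for _ in range(dim)]
                  for _ in range(pop_size)]
    initial_best = max(objective(x) for x in population)

    for _ in range(rounds):
        # Every member takes one local search step.
        population = [local_step(x, rng) for x in population]
        if share:
            # Assumed sharing rule: overwrite the worst member with a copy
            # of the current best, so search effort concentrates there.
            best = max(population, key=objective)
            worst_idx = min(range(pop_size),
                            key=lambda i: objective(population[i]))
            population[worst_idx] = list(best)

    final_best = max(objective(x) for x in population)
    return initial_best, final_best


if __name__ == "__main__":
    for share in (True, False):
        start, end = population_search(share=share)
        print(f"share={share}: best score {start:.3f} -> {end:.3f}")
```

Setting `share=False` gives the "equivalent parallel searches that do not share information" baseline in this toy setting. Whether sharing helps here depends on the objective and the sharing rule, so the sketch makes no claim about the paper's reported results.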