Isomorph-free branch and bound search for finite state controllers

Authors:
Marek Grześ;Pascal Poupart;Jesse Hoey
Affiliations:
Cheriton School of Computer Science, University of Waterloo, Waterloo, Ontario, Canada;Cheriton School of Computer Science, University of Waterloo, Waterloo, Ontario, Canada;Cheriton School of Computer Science, University of Waterloo, Waterloo, Ontario, Canada
Venue:
IJCAI'13 Proceedings of the Twenty-Third international joint conference on Artificial Intelligence
Year:
2013

Abstract

The recent proliferation of smart-phones and other wearable devices has led to a surge of new mobile applications. Partially observable Markov decision processes provide a natural framework for designing applications that continuously make decisions based on noisy sensor measurements. However, given the limited battery life of these devices, there is a need to minimize the amount of online computation. This can be achieved by compiling a policy into a finite state controller, since executing a controller requires neither belief monitoring nor online search. In this paper, we propose a new branch and bound technique to search for a good controller. In contrast to many existing algorithms for controllers, our search technique is not subject to local optima. We also show how to reduce the amount of search by avoiding the enumeration of isomorphic controllers and by taking advantage of suitable upper and lower bounds. The approach is demonstrated on several benchmark problems as well as on a smart-phone application that assists persons with Alzheimer's disease with wayfinding.
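To make two ideas from the abstract concrete, the following is a minimal illustrative sketch, not the authors' algorithm. It assumes a deterministic controller stored as two tables (hypothetical names `actions` and `succ`), with node 0 as the start node. `step` shows why execution needs no belief monitoring: each observation triggers a single table lookup. `canonical_form` shows one simple way to recognize controllers that differ only in node labels, by renumbering nodes in breadth-first order from the start node. The action names are made up for the example.

```python
from collections import deque

# Assumed representation (hypothetical, for illustration only):
#   actions[n]  -> action executed in node n
#   succ[n][o]  -> next node after receiving observation o in node n
# Node 0 is the start node.

def step(actions, succ, node, observation):
    """Advance the controller by one observation: a table lookup, no belief update."""
    next_node = succ[node][observation]
    return next_node, actions[next_node]

def canonical_form(actions, succ):
    """Renumber nodes in breadth-first order from node 0.

    Controllers that are identical up to a renaming of their reachable
    nodes map to the same result. Unreachable nodes are dropped.
    """
    new_id = {0: 0}
    queue = deque([0])
    while queue:
        node = queue.popleft()
        for nxt in succ[node]:  # observations visited in index order
            if nxt not in new_id:
                new_id[nxt] = len(new_id)
                queue.append(nxt)
    old_ids = sorted(new_id, key=new_id.get)  # old labels ordered by new label
    new_actions = tuple(actions[old] for old in old_ids)
    new_succ = tuple(tuple(new_id[s] for s in succ[old]) for old in old_ids)
    return new_actions, new_succ

# Two controllers that differ only in how nodes 1 and 2 are labeled.
actions_a = ["listen", "open-left", "open-right"]
succ_a = [[1, 2], [0, 0], [0, 0]]
actions_b = ["listen", "open-right", "open-left"]
succ_b = [[2, 1], [0, 0], [0, 0]]

same = canonical_form(actions_a, succ_a) == canonical_form(actions_b, succ_b)
```

A search procedure could store the canonical form of each controller it has expanded and skip any candidate whose canonical form was already seen. How the paper itself avoids enumerating isomorphic controllers is not described in the abstract, so this lookup scheme is only an assumption.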