Diversity-based inference of finite automata
Journal of the ACM (JACM)
Efficient dynamic-programming updates in partially observable Markov decision processes
Looping suffix tree-based inference of partially observable hidden state
ICML '06 Proceedings of the 23rd international conference on Machine learning
In this article, we present an approach to solving deterministic partially observable Markov decision processes (POMDPs) based on a history space of sequences of past observations and actions. A novel and sound technique for learning a Q-function on history spaces is developed and discussed. We analyze certain conditions under which a history-based approach can learn policies comparable to the optimal solution on belief states. The algorithm presented is model-free and can be combined with any method for learning history spaces. We also present a procedure for learning history spaces that are especially suited to our Q-learning algorithm.
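To make the idea of a Q-function over histories concrete, the following is a minimal sketch and not the algorithm developed in the article. It assumes a plain tabular Q-learning update in which the "state" is a fixed-length window over the most recent observations and actions. The toy two-step task, the window length `k`, and the names `truncate`, `q_update`, `train`, and `greedy` are all hypothetical and chosen only for illustration.

```python
from collections import defaultdict
from itertools import product

ACTIONS = (0, 1)  # hypothetical action set for the toy task


def truncate(history, k):
    """Use the last k elements (k >= 1) of the observation/action sequence as the state."""
    return tuple(history[-k:])


def q_update(Q, h, a, r, h_next, alpha=0.5, gamma=0.9):
    """One standard Q-learning step, keyed on histories instead of states.

    h_next is None when the episode has ended.
    """
    best_next = 0.0 if h_next is None else max(Q[(h_next, b)] for b in ACTIONS)
    Q[(h, a)] += alpha * (r + gamma * best_next - Q[(h, a)])


def train(k, sweeps=50):
    """Toy deterministic task (an assumption, not from the article).

    Step 0 shows a cue ("cue0" or "cue1"). Step 1 always shows "blank".
    The second action earns reward 1 only if it equals the cue.
    Every (cue, a0, a1) combination is visited in each sweep, so no
    random exploration is needed.
    """
    Q = defaultdict(float)
    for _ in range(sweeps):
        for cue, a0, a1 in product((0, 1), ACTIONS, ACTIONS):
            history = [f"cue{cue}"]
            h0 = truncate(history, k)
            history += [a0, "blank"]
            h1 = truncate(history, k)
            q_update(Q, h0, a0, 0.0, h1)
            q_update(Q, h1, a1, 1.0 if a1 == cue else 0.0, None)
    return Q


def greedy(Q, h):
    """Greedy action for history h under the learned Q-function."""
    return max(ACTIONS, key=lambda a: Q[(h, a)])
```

With `k=3` the window `(cue, a0, "blank")` still contains the cue, so the greedy second action can match it. With `k=1` both cues collapse to the same history `("blank",)`, so a single greedy action can be correct for at most one of them. This is the kind of aliasing that a well-chosen history space is meant to avoid.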