A tutorial on hidden Markov models and selected applications in speech recognition
Readings in speech recognition
Planning and acting in partially observable stochastic domains
Artificial Intelligence
Robot Learning From Demonstration
ICML '97 Proceedings of the Fourteenth International Conference on Machine Learning
Algorithms for Inverse Reinforcement Learning
ICML '00 Proceedings of the Seventeenth International Conference on Machine Learning
A POMDP formulation of preference elicitation problems
Eighteenth national conference on Artificial intelligence
Spoken dialogue management using probabilistic reasoning
ACL '00 Proceedings of the 38th Annual Meeting on Association for Computational Linguistics
ICML '06 Proceedings of the 23rd international conference on Machine learning
Reinforcement learning with limited reinforcement: using Bayes risk for active learning in POMDPs
Proceedings of the 25th international conference on Machine learning
Perseus: randomized point-based value iteration for POMDPs
Journal of Artificial Intelligence Research
Point-based value iteration: an anytime algorithm for POMDPs
IJCAI'03 Proceedings of the 18th international joint conference on Artificial intelligence
Model based Bayesian exploration
UAI'99 Proceedings of the Fifteenth conference on Uncertainty in artificial intelligence
Active learning in partially observable markov decision processes
ECML'05 Proceedings of the 16th European conference on Machine Learning
Hi-index | 0.00 |
Personal robots are becoming increasingly prevalent, which raises a number of interesting issues regarding the design and customization of interfaces to such platforms. The particular problem addressed by this paper is the use of learning methods to improve the quality and effectiveness of human-machine interaction onboard a robotic wheelchair. In support of this, we present a method for learning and adapting probabilistic models with the aid of a human operator. We use a Bayesian reinforcement learning framework, that allows us to mix learning and execution, as well as take advantage of prior information about the world. We address the problems of learning, handling a partially observable environment, and limiting the number of action requests. We demonstrate empirical feasibility of our approach on an interface for an autonomous wheelchair.