Recent research has addressed the problem of formulating a dialog agent as a partially observable Markov decision process (POMDP), and of learning a dialog policy that is optimal given the particular characteristics of the POMDP's transition, observation, and reward functions. This paper addresses the problem of learning a small set of dialog agent policies that provide near-optimal behavior over a wide range of POMDP variations, reflecting different user preferences and environment characteristics. We show that, even for a very simple dialog, we can cover a large number of simulated users to within 10% of their optimal return using fewer than 5% of the individually optimal policies.
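The core idea, selecting a small policy set such that every user achieves at least 90% of their own optimal return, can be viewed as a set-cover problem. The sketch below is a minimal illustration under assumed data: `returns[i, j]` is a hypothetical matrix of the expected return user `i` obtains under user `j`'s optimal policy (not data from the paper), and a greedy heuristic picks policies until all users are covered. The abstract does not specify the paper's actual selection method; greedy cover is just one plausible instantiation.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical setup (illustrative only): returns[i, j] is the expected
# return of user i's POMDP under the optimal policy of user j, so the
# diagonal holds each user's own optimal return.
n_users = 200
returns = rng.uniform(5.0, 10.0, size=(n_users, n_users))
optimal = rng.uniform(9.0, 10.0, size=n_users)
np.fill_diagonal(returns, optimal)

def greedy_policy_cover(returns, optimal, tol=0.10):
    """Greedily add policies until every user's best available return
    is within a fraction `tol` of that user's optimal return."""
    threshold = (1.0 - tol) * optimal          # per-user acceptable return
    best_so_far = np.full(len(optimal), -np.inf)
    chosen = []
    while np.any(best_so_far < threshold):
        uncovered = best_so_far < threshold
        # For each candidate policy, count how many uncovered users
        # it would bring within tolerance.
        gains = (returns[uncovered, :] >= threshold[uncovered, None]).sum(axis=0)
        j = int(np.argmax(gains))
        chosen.append(j)
        best_so_far = np.maximum(best_so_far, returns[:, j])
    return chosen

cover = greedy_policy_cover(returns, optimal)
print(f"{len(cover)} policies cover all {n_users} users within 10% of optimal")
```

Because the diagonal guarantees each user is covered by their own optimal policy, the loop always terminates; on structured (rather than uniform-random) user populations, far fewer policies than users typically suffice, which is the regime the abstract's "fewer than 5%" result describes.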