Partially Observable Markov Decision Processes (POMDPs) provide a principled way to model uncertainty in dialogue. However, traditional algorithms for optimising policies are intractable except for cases with very few states. This paper presents a new approach to policy optimisation based on grid-based Q-learning over a summary of the belief space. We also present a technique for bootstrapping the system using a novel agenda-based user model. A policy trained with this approach was evaluated with human subjects in an extensive trial and gave highly competitive results, achieving a 90.6% task completion rate.
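The core idea of grid-based Q-learning over a summary belief space can be sketched as follows. This is a minimal illustration, not the paper's implementation: the two-component summary (probabilities of the top two belief hypotheses), the grid resolution, and the action set are all hypothetical choices made for the example.

```python
# Hypothetical summary space: (prob. of top hypothesis, prob. of second
# hypothesis), quantised to a 0.1-spaced grid of valid points.
GRID = [(round(p1, 1), round(p2, 1))
        for p1 in [i / 10 for i in range(11)]
        for p2 in [i / 10 for i in range(11)]
        if p1 + p2 <= 1.0 and p2 <= p1]

ACTIONS = ["ask", "confirm", "submit"]  # illustrative summary actions

def nearest_grid_point(summary):
    """Map a continuous summary belief to its nearest grid point."""
    return min(GRID, key=lambda g: (g[0] - summary[0]) ** 2
                                 + (g[1] - summary[1]) ** 2)

# Tabular Q-values, one entry per (grid point, action) pair.
Q = {(g, a): 0.0 for g in GRID for a in ACTIONS}

def q_update(summary, action, reward, next_summary,
             alpha=0.1, gamma=0.95):
    """One Q-learning step: project both belief summaries onto the grid,
    then apply the standard tabular temporal-difference update."""
    g = nearest_grid_point(summary)
    g_next = nearest_grid_point(next_summary)
    best_next = max(Q[(g_next, a)] for a in ACTIONS)
    Q[(g, action)] += alpha * (reward + gamma * best_next - Q[(g, action)])
```

The grid keeps the value table finite despite the continuous belief space; during training, each observed (belief, action, reward, next belief) transition from simulated or real dialogues drives one `q_update` call.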