Does this list contain what you were searching for? Learning adaptive dialogue strategies for interactive question answering

Authors:
V. Rieser; O. Lemon
Affiliations:
School of Informatics, University of Edinburgh, Edinburgh, EH8 9AB, Great Britain; e-mail: vrieser@inf.ed.ac.uk, olemon@inf.ed.ac.uk
Venue:
Natural Language Engineering
Year:
2009

Abstract

Policy learning is an active topic in dialogue systems research, but it has not been explored in relation to interactive question answering (IQA). We take a first step in learning adaptive interaction policies for question answering: we address how to acquire enough reliable query constraints, how many database results to present to the user, and when to present them, given the competing trade-offs between the length of the answer list, the length of the interaction, the type of database and the noise in the communication channel. The operating conditions are reflected in an objective function, which we use to derive both a hand-coded threshold-based policy and the rewards used to train a reinforcement learning policy. The same objective function is used for evaluation. We show that we can learn strategies for this complex trade-off problem that perform significantly better than a variety of hand-coded policies across a wide range of noise conditions, user types, database types and turn penalties. Our policy learning framework thus covers a wide spectrum of operating conditions. The learned policies produce an average relative increase in reward of 86.78% over the hand-coded policies. In 93% of the cases the learned policies perform significantly better than the hand-coded ones (p