Toward learning and evaluation of dialogue policies with text examples

Authors:
David DeVault;Anton Leuski;Kenji Sagae
Affiliations:
University of Southern California, Playa Vista, CA;University of Southern California, Playa Vista, CA;University of Southern California, Playa Vista, CA
Venue:
SIGDIAL '11 Proceedings of the SIGDIAL 2011 Conference
Year:
2011

Citing 7
Cited 1

Procedure for quantitatively comparing the syntactic coverage of English grammars

HLT '91 Proceedings of the workshop on Speech and Natural Language
A maximum entropy approach to natural language processing

Computational Linguistics
BLEU: a method for automatic evaluation of machine translation

ACL '02 Proceedings of the 40th Annual Meeting on Association for Computational Linguistics
A corpus-based approach to help-desk response generation

CIMCA '06 Proceedings of the International Conference on Computational Inteligence for Modelling Control and Automation and International Conference on Intelligent Agents Web Technologies and International Commerce
Virtual Patients for Clinical Therapist Skills Training

IVA '07 Proceedings of the 7th international conference on Intelligent Virtual Agents
Unsupervised modeling of Twitter conversations

HLT '10 Human Language Technologies: The 2010 Annual Conference of the North American Chapter of the Association for Computational Linguistics
I've said it before, and I'll say it again: an empirical investigation of the upper bound of the selection approach to dialogue

SIGDIAL '10 Proceedings of the 11th Annual Meeting of the Special Interest Group on Discourse and Dialogue

A study in how NLU performance can affect the choice of dialogue system architecture

SIGDIAL '12 Proceedings of the 13th Annual Meeting of the Special Interest Group on Discourse and Dialogue

Quantified Score

Hi-index	0.00

Visualization

Abstract

We present a dialogue collection and enrichment framework that is designed to explore the learning and evaluation of dialogue policies for simple conversational characters using textual training data. To facilitate learning and evaluation, our framework enriches a collection of role-play dialogues with additional training data, including paraphrases of user utterances, and multiple independent judgments by external referees about the best policy response for the character at each point. As a case study, we use this framework to train a policy for a limited domain tactical questioning character, reaching promising performance. We also introduce an automatic policy evaluation metric that recognizes the validity of multiple conversational responses at each point in a dialogue. We use this metric to explore the variability in human opinion about optimal policy decisions, and to automatically evaluate several learned policies in our example domain.