Learning lexical alignment policies for generating referring expressions in spoken dialogue systems

  • Authors:
  • Srinivasan Janarthanam; Oliver Lemon

  • Affiliations:
  • University of Edinburgh, Edinburgh; University of Edinburgh, Edinburgh

  • Venue:
  • ENLG '09 Proceedings of the 12th European Workshop on Natural Language Generation
  • Year:
  • 2009


Abstract

We address the problem that different users have different lexical knowledge of a problem domain, so automated dialogue systems need to adapt their generation choices online to each user's domain knowledge as they encounter it. We approach this problem using policy learning in Markov Decision Processes (MDPs). In contrast to related work, we propose a new statistical user model which incorporates the lexical knowledge of different users. We evaluate this user model by showing that it allows us to learn dialogue policies that automatically adapt their choice of referring expressions online to different users, and that these policies are significantly better than adaptive hand-coded policies for this problem. The learned policies are consistently between 2 and 8 turns shorter than a range of hand-coded but adaptive baseline lexical alignment policies.