We address the problem that different users have different levels of lexical knowledge about a problem domain, so that automated dialogue systems need to adapt their generation choices online to each user's domain knowledge as they encounter it. We approach this problem using Reinforcement Learning in Markov Decision Processes (MDPs). We present a reinforcement learning framework for learning adaptive referring expression generation (REG) policies that adapt dynamically to users with different levels of domain knowledge. In contrast to related work, we also propose a new statistical user model that incorporates the lexical knowledge of different users. We evaluate this framework by showing that it allows us to learn dialogue policies that automatically adapt their choice of referring expressions online to different users, and that these policies are significantly better than hand-coded adaptive policies for this problem. The learned policies are consistently between 2 and 8 turns shorter than a range of hand-coded but adaptive baseline REG policies.
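To make the framework concrete, the sketch below shows a minimal, hypothetical instance of the general idea: tabular Q-learning over a toy MDP in which the state tracks the system's belief about the user's expertise, the actions are two referring-expression styles ("jargon" vs. "descriptive"), and a simulated user with a given lexical knowledge level stands in for the statistical user model. The state space, reward scheme, and user simulation here are illustrative assumptions, not the paper's actual model; the reward simply penalises the extra clarification turn caused by a misunderstood expression, so shorter dialogues emerge from higher reward.

```python
import random
from collections import defaultdict

ACTIONS = ["jargon", "descriptive"]

class SimulatedUser:
    """Toy stand-in for a statistical user model of lexical knowledge."""
    def __init__(self, known_fraction):
        # Fraction of domain jargon this user is assumed to know.
        self.known_fraction = known_fraction

    def understands(self, expression_type):
        if expression_type == "descriptive":
            return True  # long descriptive expressions are always understood
        return random.random() < self.known_fraction  # jargon only if known

def run_episode(q, user, n_references=5, epsilon=0.1, alpha=0.5, gamma=0.9):
    """One dialogue: the system refers to n_references domain objects."""
    state = "unknown"  # belief about the user's expertise
    turns = 0
    for _ in range(n_references):
        # Epsilon-greedy choice of referring-expression style.
        if random.random() < epsilon:
            action = random.choice(ACTIONS)
        else:
            action = max(ACTIONS, key=lambda a: q[(state, a)])
        turns += 1
        if user.understands(action):
            reward = 0.0
            next_state = "expert" if action == "jargon" else state
        else:
            turns += 1      # misunderstood jargon costs a clarification turn
            reward = -1.0
            next_state = "novice"
        # Standard Q-learning update.
        best_next = max(q[(next_state, a)] for a in ACTIONS)
        q[(state, action)] += alpha * (reward + gamma * best_next - q[(state, action)])
        state = next_state
    return turns
```

Trained against simulated users of varying `known_fraction`, a policy learned this way comes to prefer descriptive expressions once it infers a novice user and jargon for experts, adapting its generation choices online in the spirit of the framework described above.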