Assessing user simulation for dialog systems using human judges and automatic evaluation measures
Natural Language Engineering
We propose using user simulation for testing during the development of a sophisticated dialog system. Although the limited behaviors of state-of-the-art user simulators may not cover every aspect of dialog system testing, our approach extends the simulation's functionality so that it can be used at least for early-stage testing, before the system reaches performance stable enough for evaluation with human users. The approach includes a set of evaluation measures that can be computed automatically from the interaction logs between the user simulator and the dialog system. We first validate these measures against user satisfaction scores on human-user dialogs, and we build a regression model that estimates user satisfaction from the evaluation measures. We then apply the measures to a simulated dialog corpus generated by a simulator trained on the real-user corpus, and show that the user satisfaction scores estimated from the simulated corpus are not statistically different from the real users' satisfaction scores.
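The pipeline the abstract describes can be sketched in a few lines: fit a regression model mapping automatically computed dialog measures to user satisfaction on real-user data, apply the model to measures from a simulated corpus, and test whether the estimated scores differ statistically from the real ratings. The sketch below uses synthetic stand-in data and hypothetical measure names (task-completion rate, turns per dialog); the actual measures, corpora, and regression setup are those of the paper, not shown here.

```python
# Hedged sketch of the evaluation pipeline (data and measure names are
# hypothetical stand-ins, not the paper's corpus).
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

# Hypothetical per-dialog measures from REAL-user interaction logs,
# e.g. task-completion rate and normalized turns per dialog.
X_real = rng.uniform(0.0, 1.0, size=(50, 2))
# Hypothetical 1-5 satisfaction ratings collected from those users.
sat_real = 2.0 + 2.5 * X_real[:, 0] - 1.0 * X_real[:, 1] \
    + rng.normal(0.0, 0.3, 50)

# Fit a linear regression (least squares with an intercept column).
A = np.hstack([X_real, np.ones((len(X_real), 1))])
coef, *_ = np.linalg.lstsq(A, sat_real, rcond=None)

# Apply the fitted model to measures computed from the SIMULATED corpus.
X_sim = rng.uniform(0.0, 1.0, size=(50, 2))
sat_est = np.hstack([X_sim, np.ones((len(X_sim), 1))]) @ coef

# Two-sample t-test: are estimated and real scores statistically different?
t, p = stats.ttest_ind(sat_real, sat_est)
print(f"t={t:.2f}, p={p:.3f}")  # a large p would mirror the paper's finding
```

The key design point is that only the regression model ever sees human ratings; once fitted, satisfaction for a simulated corpus is estimated purely from log-derived measures, so the comparison requires no additional human evaluation.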