User simulations are increasingly employed in the development and evaluation of spoken dialog systems. However, there is no accepted method for evaluating user simulations, which is problematic because the performance of new dialog management techniques is often evaluated on user simulations alone, rather than with real users. In this paper, we propose a novel method of evaluating user simulations. We view a user simulation as a predictor of the performance of a dialog system, where per-dialog performance is measured with a domain-specific scoring function. The divergence between the distributions of dialog scores in the real and simulated corpora provides a measure of the quality of the user simulation, and we argue that the Cramér-von Mises divergence is well suited to this task. To demonstrate this technique, we study a corpus of callers with real information needs and show that the Cramér-von Mises divergence conforms to expectations. Finally, we present simple tools that enable practitioners to interpret the statistical significance of comparisons between user simulations.
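To illustrate the idea, the following is a minimal sketch of a two-sample Cramér-von Mises-style divergence between the empirical distributions of real and simulated dialog scores. The function name, the choice to evaluate the empirical CDFs at the pooled score points, and the normalization are illustrative assumptions, not the exact normalized statistic defined in the paper.

```python
import bisect

def cvm_divergence(real_scores, sim_scores):
    """Illustrative two-sample Cramer-von Mises-style divergence.

    Compares the empirical CDFs of per-dialog scores from a real
    corpus and a simulated corpus; 0 means identical empirical
    distributions, larger values mean greater mismatch.
    (Sketch only; normalization differs from the paper's statistic.)
    """
    def ecdf(sample):
        # Empirical CDF: fraction of the sample at or below x.
        s = sorted(sample)
        return lambda x: bisect.bisect_right(s, x) / len(s)

    F = ecdf(real_scores)   # real-corpus score distribution
    G = ecdf(sim_scores)    # simulated-corpus score distribution

    # Average squared CDF difference over the pooled score points,
    # then take the square root so the result lives on the score-CDF scale.
    pooled = sorted(list(real_scores) + list(sim_scores))
    mean_sq = sum((F(x) - G(x)) ** 2 for x in pooled) / len(pooled)
    return mean_sq ** 0.5
```

Under this sketch, a user simulation whose score distribution matches the real corpus yields a divergence near zero, and competing simulations can be ranked by this value; the paper's tools for statistical significance would then indicate whether an observed difference between two simulations is meaningful.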