We use a machine learner trained on a combination of acoustic and contextual features to predict the accuracy of incoming n-best automatic speech recognition (ASR) hypotheses to a spoken dialogue system (SDS). Our novel approach is to use a simple statistical User Simulation (US) for this task, which measures the likelihood that the user would say each hypothesis in the current context. Such US models are now common in machine learning approaches to SDS, are trained on real dialogue data, and are related to theories of "alignment" in psycholinguistics. We use a US to predict the user's next dialogue move and thereby re-rank the n-best hypotheses of a speech recognizer for a corpus of 2,564 user utterances. Compared to the baseline policy of selecting the topmost ASR hypothesis, the method achieved a significant 5% relative reduction in Word Error Rate (WER), which is 44% of the possible WER improvement on this data, and 62% of the possible semantic improvement (Dialogue Move Accuracy). Information Gain analysis shows that the majority of the improvement is attributable to the User Simulation feature.
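The core idea above, re-ranking an ASR n-best list by interpolating acoustic confidence with a context-conditioned User Simulation likelihood, can be sketched as follows. This is a minimal illustrative sketch, not the paper's actual model: the function names, the interpolation weight, and the toy probability tables (a simple conditional distribution over user dialogue moves given the last system move) are all assumptions.

```python
# Hedged sketch of US-based n-best re-ranking.
# Assumption: the US is a conditional table P(user move | last system move)
# estimated from dialogue data; the real system's features are richer.

def us_likelihood(user_move, system_move, us_model):
    """P(user produces `user_move` given the system's last dialogue move)."""
    return us_model.get(system_move, {}).get(user_move, 1e-6)  # smoothed floor

def rerank(nbest, system_move, us_model, weight=0.5):
    """Pick the hypothesis maximizing an interpolation of ASR confidence
    and the User Simulation likelihood of its dialogue move in context."""
    def score(hyp):
        _, move, asr_conf = hyp
        return (1 - weight) * asr_conf + weight * us_likelihood(move, system_move, us_model)
    return max(nbest, key=score)

# Toy US model (illustrative numbers, not from the paper's corpus).
us_model = {
    "request(destination)": {"inform(destination)": 0.7, "confirm": 0.1},
}

# Each n-best entry: (hypothesis string, predicted dialogue move, ASR confidence).
nbest = [
    ("yes", "confirm", 0.55),                    # topmost ASR hypothesis
    ("to Boston", "inform(destination)", 0.45),  # contextually more likely
]

best = rerank(nbest, "request(destination)", us_model)
print(best[0])  # prints "to Boston": context overrides the top hypothesis
```

Here the baseline policy (take the topmost hypothesis) would choose "yes", while the US feature promotes the hypothesis whose dialogue move is most likely in the current context, which is exactly the behavior the reported WER and Dialogue Move Accuracy gains rely on.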