User simulations for context-sensitive speech recognition in spoken dialogue systems

  • Authors:
  • Oliver Lemon; Ioannis Konstas

  • Affiliations:
  • Edinburgh University;University of Glasgow

  • Venue:
  • EACL '09 Proceedings of the 12th Conference of the European Chapter of the Association for Computational Linguistics
  • Year:
  • 2009

Abstract

We use a machine learner trained on a combination of acoustic and contextual features to predict the accuracy of incoming n-best automatic speech recognition (ASR) hypotheses to a spoken dialogue system (SDS). Our novel approach is to use a simple statistical User Simulation (US) for this task, which measures the likelihood that the user would say each hypothesis in the current context. Such US models are now common in machine learning approaches to SDS, are trained on real dialogue data, and are related to theories of "alignment" in psycholinguistics. We use a US to predict the user's next dialogue move and thereby re-rank n-best hypotheses of a speech recognizer for a corpus of 2564 user utterances. The method achieved a significant relative reduction of Word Error Rate (WER) of 5% (this is 44% of the possible WER improvement on this data), and 62% of the possible semantic improvement (Dialogue Move Accuracy), compared to the baseline policy of selecting the topmost ASR hypothesis. The majority of the improvement is attributable to the User Simulation feature, as shown by Information Gain analysis.
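The re-ranking idea in the abstract can be sketched as follows. This is an illustrative reconstruction, not the authors' implementation: the scoring function, the interpolation weight `alpha`, and the toy user-simulation model are all assumptions introduced here for clarity.

```python
# Hedged sketch: re-rank n-best ASR hypotheses by combining each hypothesis's
# acoustic confidence with a User Simulation probability that the user would
# say it in the current dialogue context. All names here are hypothetical.

def rerank(nbest, us_prob, context, alpha=0.5):
    """nbest: list of (hypothesis, asr_confidence) pairs.
    us_prob(hyp, context): assumed US model returning a value in [0, 1].
    alpha: interpolation weight (an assumption, not from the paper)."""
    scored = [
        (alpha * conf + (1 - alpha) * us_prob(hyp, context), hyp)
        for hyp, conf in nbest
    ]
    return max(scored)[1]  # hypothesis with the highest combined score

def toy_us(hyp, context):
    """Toy stand-in for a statistical User Simulation: in a 'confirm'
    context, the user is far more likely to utter yes/no responses."""
    expected = {"confirm": {"yes", "no"}, "ask_city": {"paris", "london"}}
    return 0.9 if hyp in expected.get(context, set()) else 0.1

# Two acoustically confusable hypotheses; context pushes the choice to "yes".
nbest = [("us", 0.55), ("yes", 0.50)]
print(rerank(nbest, toy_us, "confirm"))  # prints "yes"
```

The point of the sketch is that the topmost (baseline) hypothesis `"us"` loses to `"yes"` once dialogue context is taken into account, which is the mechanism behind the WER and Dialogue Move Accuracy gains reported above.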