Stability and accuracy in incremental speech recognition

Authors:
Ethan O. Selfridge;Iker Arizmendi;Peter A. Heeman;Jason D. Williams
Affiliations:
Oregon Health & Science University, Portland, OR;AT&T Labs -- Research, Shannon Laboratory, Florham Park, NJ;Oregon Health & Science University, Portland, OR;AT&T Labs -- Research, Shannon Laboratory, Florham Park, NJ
Venue:
SIGDIAL '11 Proceedings of the SIGDIAL 2011 Conference
Year:
2011

Abstract

Conventional speech recognition approaches usually wait until the user has finished talking before returning a recognition hypothesis. This results in spoken dialogue systems that are unable to react while the user is still speaking. Incremental Speech Recognition (ISR), where partial phrase results are returned during user speech, has been used to create more reactive systems. However, ISR output is unstable: partial results are prone to revision as more speech is decoded. This paper tackles the problem of stability in ISR. We first present a method that increases the stability and accuracy of ISR output without adding delay. Given that some revisions are unavoidable, we next present a pair of methods for predicting the stability and accuracy of ISR results. We believe that, taken together, these approaches make ISR more useful for real spoken dialogue systems.
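
To make the notion of an unstable partial result concrete, the following is a minimal sketch and not the method described in the paper. It assumes one simple working definition: a partial result counts as stable if its words form a prefix of the final recognition hypothesis. The function names (`is_prefix`, `stability_flags`) and the example utterance are hypothetical and chosen only for illustration.

```python
from typing import List


def is_prefix(partial: List[str], final: List[str]) -> bool:
    """Return True if the words of `partial` are a prefix of `final`."""
    return len(partial) <= len(final) and final[:len(partial)] == partial


def stability_flags(partials: List[str], final: str) -> List[bool]:
    """Flag each partial result as stable (True) or later revised (False).

    Assumption for this sketch: "stable" means the partial is a word-level
    prefix of the final hypothesis. The paper may define it differently.
    """
    final_words = final.split()
    return [is_prefix(p.split(), final_words) for p in partials]


# Hypothetical sequence of partial results emitted while the user speaks.
partials = [
    "i",
    "i want",
    "i want to fly",          # later revised by the recognizer
    "i want a flight",
    "i want a flight to boston",
]
final = "i want a flight to boston"

flags = stability_flags(partials, final)
stable_fraction = sum(flags) / len(flags)
print(flags)
print(stable_fraction)
```

In this made-up example the third partial is revised once more speech is decoded, so a dialogue system that acted on it immediately would have acted on words that do not survive into the final result.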