Assessing and improving the performance of speech recognition for incremental systems

Authors:
Timo Baumann;Michaela Atterer;David Schlangen
Affiliations:
Universität Potsdam, Potsdam, Germany;Universität Potsdam, Potsdam, Germany;Universität Potsdam, Potsdam, Germany
Venue:
NAACL '09 Proceedings of Human Language Technologies: The 2009 Annual Conference of the North American Chapter of the Association for Computational Linguistics
Year:
2009

Citing 1
Cited 11

An architecture for more realistic conversational systems
Proceedings of the 6th international conference on Intelligent user interfaces

Incremental reference resolution: the task, metrics for evaluation, and a Bayesian filtering model that is sensitive to disfluencies
SIGDIAL '09 Proceedings of the SIGDIAL 2009 Conference: The 10th Annual Meeting of the Special Interest Group on Discourse and Dialogue

TELIDA: a package for manipulation and visualization of timed linguistic data
SIGDIAL '09 Proceedings of the SIGDIAL 2009 Conference: The 10th Annual Meeting of the Special Interest Group on Discourse and Dialogue

Collaborating on utterances with a spoken dialogue system using an ISU-based approach to incremental dialogue management
SIGDIAL '10 Proceedings of the 11th Annual Meeting of the Special Interest Group on Discourse and Dialogue

Stability and accuracy in incremental speech recognition
SIGDIAL '11 Proceedings of the SIGDIAL 2011 Conference

Predicting the micro-timing of user input for an incremental spoken dialogue system that completes a user's ongoing turn
SIGDIAL '11 Proceedings of the SIGDIAL 2011 Conference

Voice typing: a new speech interaction model for dictation on touchscreen devices
Proceedings of the SIGCHI Conference on Human Factors in Computing Systems

Joint satisfaction of syntactic and pragmatic constraints improves incremental spoken language understanding
EACL '12 Proceedings of the 13th Conference of the European Chapter of the Association for Computational Linguistics

The InproTK 2012 release
SDCTD '12 NAACL-HLT Workshop on Future Directions and Needs in the Spoken Dialog Community: Tools and Data

A temporal simulator for developing turn-taking methods for spoken dialogue systems
SIGDIAL '12 Proceedings of the 13th Annual Meeting of the Special Interest Group on Discourse and Dialogue

Integrating incremental speech recognition and POMDP-based dialogue systems
SIGDIAL '12 Proceedings of the 13th Annual Meeting of the Special Interest Group on Discourse and Dialogue

Situated incremental natural language understanding using Markov Logic Networks
Computer Speech and Language

Abstract

In incremental spoken dialogue systems, partial hypotheses about what was said are required even while the utterance is still ongoing. We define measures for evaluating the quality of incremental ASR components with respect to the relative correctness of the partial hypotheses compared to hypotheses that can optimize over the complete input, the timing of hypothesis formation relative to the portion of the input they are about, and hypothesis stability, defined as the number of times they are revised. We show that simple incremental post-processing can improve stability dramatically, at the cost of timeliness (from 90 % of edits of hypotheses being spurious down to 10 % at a lag of 320 ms). The measures are not independent, and we show how system designers can find a desired operating point for their ASR. To our knowledge, we are the first to suggest and examine a variety of measures for assessing incremental ASR and improve performance on this basis.
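
To make the stability measure more concrete, the following is a minimal sketch of how the share of spurious edits could be computed from a sequence of partial hypotheses. It is an illustration only, not the authors' implementation: the function names (`count_edits`, `spurious_edit_fraction`), the word-level edit model (revoke everything after the common prefix, then add the new words), and the example hypotheses are all assumptions made here. The abstract itself only states that stability is based on how often hypotheses are revised.

```python
from typing import List, Sequence


def count_edits(prev: Sequence[str], curr: Sequence[str]) -> int:
    """Count word-level edits needed to turn `prev` into `curr`.

    Assumed edit model: words after the longest common prefix of `prev`
    are revoked, then the remaining words of `curr` are added.
    """
    common = 0
    for a, b in zip(prev, curr):
        if a != b:
            break
        common += 1
    revoked = len(prev) - common
    added = len(curr) - common
    return revoked + added


def spurious_edit_fraction(partials: List[List[str]]) -> float:
    """Fraction of all edits that were not needed to reach the final hypothesis.

    Assumption: the necessary edits are one 'add' per word of the final
    hypothesis; every edit beyond that counts as spurious.
    """
    total = 0
    prev: List[str] = []
    for hyp in partials:
        total += count_edits(prev, hyp)
        prev = hyp
    if total == 0:
        return 0.0
    necessary = len(partials[-1])
    return (total - necessary) / total


# Hypothetical sequence of partial hypotheses for the utterance "forty two".
partials = [
    ["for"],
    ["four"],
    ["four", "tea"],
    ["forty"],
    ["forty", "two"],
]

print(spurious_edit_fraction(partials))  # 8 edits in total, 2 necessary -> 0.75
```

Under these assumptions, a post-processing step that delays or withholds unstable words would lower this fraction while increasing the lag before a word is reported, which is the trade-off between stability and timeliness that the abstract describes.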