Mobile texting: can post-ASR correction solve the issues? An experimental study on gain vs. costs

  • Authors:
  • Michael Feld;Saeedeh Momtazi;Farina Freigang;Dietrich Klakow;Christian Müller

  • Affiliations:
  • German Research Center for Artificial Intelligence, Saarbrücken, Germany;Saarland University, Saarbrücken, Germany;German Research Center for Artificial Intelligence, Saarbrücken, Germany;Saarland University, Saarbrücken, Germany;German Research Center for Artificial Intelligence, Saarbrücken, Germany

  • Venue:
  • Proceedings of the 2012 ACM international conference on Intelligent User Interfaces
  • Year:
  • 2012

Abstract

The next big step in embedded, mobile speech recognition will be to allow completely free input, as is needed for messaging such as SMS or email. However, unconstrained dictation remains error-prone, especially in noisy environments. In this paper, we compare different methods for improving a given free-text dictation system used to enter text-based messages in embedded mobile scenarios, where distraction, interaction cost, and hardware limitations impose stricter constraints than in traditional scenarios. We present a corpus-based evaluation that measures the trade-off between the improvement in word error rate and the number of interaction steps required, under various parameters. Results show that post-processing the output of a "black box" speech recognizer (e.g., a web-based speech recognition service) can reduce the word error rate by 55% relative (10.3% absolute). For further error reduction, however, a richer representation of the original hypotheses (e.g., a lattice) is necessary.
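
To make the reported metric concrete, the following is a minimal sketch of how word error rate (WER) and its absolute and relative reduction are commonly computed. It is not the authors' evaluation code: the function names `word_error_rate` and `wer_reduction` and the example sentences are hypothetical, and the WER definition used here (word-level edit distance divided by reference length) is the standard one, assumed rather than taken from the paper.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the number of reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference prefix
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution_cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,                      # deletion
                curr[j - 1] + 1,                  # insertion
                prev[j - 1] + substitution_cost,  # match or substitution
            ))
        prev = curr
    return prev[-1] / len(ref)


def wer_reduction(baseline_wer: float, corrected_wer: float) -> tuple[float, float]:
    """Return (absolute, relative) WER reduction; both as fractions."""
    absolute = baseline_wer - corrected_wer
    relative = absolute / baseline_wer
    return absolute, relative


# Hypothetical example: one substituted word out of four reference words.
example_wer = word_error_rate("send the message now", "send a message now")

# Illustrative numbers only, not results from the paper.
absolute, relative = wer_reduction(baseline_wer=0.20, corrected_wer=0.10)
```

Read this way, a 55% relative reduction that corresponds to 10.3 percentage points absolute would imply a baseline WER of roughly 10.3 / 0.55 ≈ 18.7%, assuming both figures refer to the same baseline; the abstract itself does not state the baseline.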