Characterizing and recognizing spoken corrections in human-computer dialogue

  • Authors:
  • Gina-Anne Levow

  • Affiliations:
  • MIT Al Laboratory, Cambridge, MA

  • Venue:
  • COLING '98 Proceedings of the 17th international conference on Computational linguistics - Volume 1
  • Year:
  • 1998

Quantified Score

Hi-index 0.00

Visualization

Abstract

Miscommunication in speech recognition systems is unavoidable, but a detailed characterization of user corrections will enable speech systems to identify when a correction is taking place and to more accurately recognize the content of correction utterances. In this paper we investigate the adaptations of users when they encounter recognition errors in interactions with a voice-in/voice-out spoken language system. In analyzing more than 300 pairs of original and repeat correction utterances, matched on speaker and lexical content, we found overall increases in both utterance and pause duration from original to correction. Interestingly, corrections of misrecognition erros (CME) exhibited significantly heightened pitch variability, while corrections of rejection errors (CRE) showed only a small but significant decrease in pitch minimum. CME's demonstrated much greater increases in measures of duration and pitch variability than CRE's. These contrasts allow the development of decision trees which distinguish CME's from CRE's and from original inputs at 70--75% accuracy based on duration, pitch, and amplitude features.