On NoMatchs, NoInputs and BargeIns: do non-acoustic features support anger detection?

  • Authors: Alexander Schmitt, Tobias Heinroth, Jackson Liscombe

  • Affiliations: Ulm University, Germany; Ulm University, Germany; SpeechCycle, Inc., Broadway, New York City

  • Venue: SIGDIAL '09 Proceedings of the SIGDIAL 2009 Conference: The 10th Annual Meeting of the Special Interest Group on Discourse and Dialogue
  • Year: 2009

Abstract

Most studies on speech-based emotion recognition are based on prosodic and acoustic features and employ only artificial, acted corpora, whose results cannot be generalized to telephone-based speech applications. In contrast, we present an approach based on utterances from 1,911 calls to a deployed telephone-based speech application, incorporating additional dialogue, NLU, and ASR features into the emotion recognition process. Depending on the task, the non-acoustic features add 2.3% in classification accuracy compared to using acoustic features alone.
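
To illustrate the kind of feature fusion the abstract describes, here is a minimal sketch assuming scikit-learn. The feature names (num_nomatch, num_noinput, barged_in, asr_confidence, nlu_parse_ok) are hypothetical stand-ins suggested by the paper's title and abstract, not the authors' actual feature set, and the classifier choice is arbitrary; the random arrays only exercise the pipeline.

```python
# Sketch: fusing acoustic features with dialogue/ASR/NLU features for
# anger classification. Feature names and the classifier are
# illustrative assumptions, NOT the paper's actual setup.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
n_utterances = 200  # stand-in for utterances drawn from real calls

# Acoustic features, e.g. per-utterance pitch/energy statistics.
acoustic = rng.normal(size=(n_utterances, 10))

# Hypothetical non-acoustic features: dialogue events accumulated so
# far in the call, plus ASR and NLU signals.
non_acoustic = np.column_stack([
    rng.integers(0, 5, n_utterances),   # num_nomatch
    rng.integers(0, 5, n_utterances),   # num_noinput
    rng.integers(0, 2, n_utterances),   # barged_in (0/1)
    rng.uniform(0, 1, n_utterances),    # asr_confidence
    rng.integers(0, 2, n_utterances),   # nlu_parse_ok (0/1)
])

labels = rng.integers(0, 2, n_utterances)  # 1 = angry, 0 = non-angry

clf = RandomForestClassifier(n_estimators=100, random_state=0)

# Baseline: acoustic features only.
acc_acoustic = cross_val_score(clf, acoustic, labels, cv=5).mean()

# Fusion: concatenate acoustic and non-acoustic feature vectors.
fused = np.hstack([acoustic, non_acoustic])
acc_fused = cross_val_score(clf, fused, labels, cv=5).mean()

print(f"acoustic only: {acc_acoustic:.3f}  fused: {acc_fused:.3f}")
```

On real call data the random arrays would be replaced by per-utterance measurements, and the comparison between the two cross-validated scores is the kind of evaluation that yields the accuracy gain reported above.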