On NoMatchs, NoInputs and BargeIns: do non-acoustic features support anger detection?

  • Authors: Alexander Schmitt, Tobias Heinroth, Jackson Liscombe

  • Affiliations: Ulm University, Germany; Ulm University, Germany; SpeechCycle, Inc., Broadway, New York City

  • Venue: SIGDIAL '09 Proceedings of the SIGDIAL 2009 Conference: The 10th Annual Meeting of the Special Interest Group on Discourse and Dialogue
  • Year: 2009

Abstract

Most studies on speech-based emotion recognition are based on prosodic and acoustic features and employ only artificial, acted corpora, whose results cannot be generalized to telephone-based speech applications. In contrast, we present an approach based on utterances from 1,911 calls to a deployed telephone-based speech application, incorporating additional dialogue, NLU, and ASR features into the emotion recognition process. Depending on the task, the non-acoustic features add 2.3% in classification accuracy compared to using acoustic features alone.
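
To illustrate the kind of feature fusion the abstract describes, here is a minimal sketch assuming scikit-learn. The feature names (num_nomatch, num_noinput, barged_in, asr_confidence, nlu_parse_ok) are hypothetical stand-ins suggested by the paper's title and abstract, not the authors' actual feature set, and the classifier choice is arbitrary; the random arrays only exercise the pipeline.

```python
# Sketch: fusing acoustic features with dialogue/ASR/NLU features for
# anger classification. Feature names and the classifier are
# illustrative assumptions, NOT the paper's actual setup.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
n_utterances = 200  # stand-in for utterances drawn from real calls

# Acoustic features, e.g. per-utterance pitch/energy statistics.
acoustic = rng.normal(size=(n_utterances, 10))

# Hypothetical non-acoustic features: dialogue events accumulated so
# far in the call, plus ASR and NLU signals.
non_acoustic = np.column_stack([
    rng.integers(0, 5, n_utterances),   # num_nomatch
    rng.integers(0, 5, n_utterances),   # num_noinput
    rng.integers(0, 2, n_utterances),   # barged_in (0/1)
    rng.uniform(0, 1, n_utterances),    # asr_confidence
    rng.integers(0, 2, n_utterances),   # nlu_parse_ok (0/1)
])

labels = rng.integers(0, 2, n_utterances)  # 1 = angry, 0 = non-angry

clf = RandomForestClassifier(n_estimators=100, random_state=0)

# Baseline: acoustic features only.
acc_acoustic = cross_val_score(clf, acoustic, labels, cv=5).mean()

# Fusion: concatenate acoustic and non-acoustic feature vectors.
fused = np.hstack([acoustic, non_acoustic])
acc_fused = cross_val_score(clf, fused, labels, cv=5).mean()

print(f"acoustic only: {acc_acoustic:.3f}  fused: {acc_fused:.3f}")
```

On real call data the random arrays would be replaced by per-utterance measurements, and the comparison between the two cross-validated scores is the kind of evaluation that yields the accuracy gain reported above.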