How to find trouble in communication

  • Authors:
  • A. Batliner;K. Fischer;R. Huber;J. Spilker;E. Nöth

  • Affiliations:
  • Lehrstuhl für Mustererkennung (Informatik 5), University of Erlangen-Nuremberg, Martensstrasse 3, 91058 Erlangen, Germany;University of Bremen, Fachbereich 10, Sprach- und Literaturwissenschaften, Postfach 330440, 28334 Bremen, Germany;Lehrstuhl für Mustererkennung (Informatik 5), University of Erlangen-Nuremberg, Martensstrasse 3, 91058 Erlangen, Germany;Lehrstuhl für Mustererkennung (Informatik 5), University of Erlangen-Nuremberg, Martensstrasse 3, 91058 Erlangen, Germany;Lehrstuhl für Mustererkennung (Informatik 5), University of Erlangen-Nuremberg, Martensstrasse 3, 91058 Erlangen, Germany

  • Venue:
  • Speech Communication - Special issue on speech and emotion
  • Year:
  • 2003

Abstract

Automatic dialogue systems used, for instance, in call centers should be able to determine, in a critical phase of the dialogue (indicated by the customer's vocal expression of anger or irritation), when it is better to hand over to a human operator. At first glance, this does not seem to be a complicated task: it is reported in the literature that emotions can be told apart quite reliably on the basis of prosodic features. However, these results are achieved most of the time in a laboratory setting, with experienced speakers (actors), and with elicited, controlled speech. We compare classification results obtained with the same feature set for elicited speech and for a Wizard-of-Oz scenario, where users believe that they are really communicating with an automatic dialogue system. It turns out that the closer we get to a realistic scenario, the less reliable prosody is as an indicator of the speakers' emotional state. As a consequence, we propose to change the target: rather than looking for traces of particular emotions in the users' speech, we look for indicators of TROUBLE IN COMMUNICATION. To this end, we propose the module Monitoring of User State [especially of] Emotion (MOUSE), in which a prosodic classifier is combined with other knowledge sources, such as conversationally peculiar linguistic behavior, for example, the use of repetitions. For this module, preliminary experimental results are reported showing a more adequate modelling of TROUBLE IN COMMUNICATION.