Whence and Whither: The Automatic Recognition of Emotions in Speech (Invited Keynote)

  • Authors:
  • Anton Batliner

  • Affiliations:
  • Lehrstuhl für Mustererkennung, Friedrich-Alexander-Universität Erlangen-Nürnberg, Erlangen, Germany D-91058

  • Venue:
  • PIT '08 Proceedings of the 4th IEEE tutorial and research workshop on Perception and Interactive Technologies for Speech-Based Systems: Perception in Multimodal Dialogue Systems
  • Year:
  • 2008

Abstract

In this talk, we first sketch the (short) history of the automatic recognition of emotions in speech. Studies on the characteristics of emotions in speech were published as early as the 1920s and 1930s; attempts to recognize them automatically began in the mid-1990s, dealing with acted data. Such data are still used often, too often if we consider that drawing inferences from acted data to realistic data is at least sub-optimal.

In the second part, we present the necessary 'basics': the design of the scenario, the recordings, the manual processing (transliteration, annotation), etc. These basics are to some extent 'generic'; for instance, each speech database has to be transliterated orthographically somehow. Others are specific, such as the principles and guidelines for emotion annotation and the basic choice between, for example, dimensional and categorical approaches. The pros and cons of different annotation approaches have been discussed widely; the unit of analysis (utterance, turn, sentence, etc.), however, has rarely been dealt with, so we will discuss this topic in more detail.

In the third part, we present acoustic and linguistic features that have been used (or should be used) in this field, and touch on their differing degrees of relevance.

The fourth part addresses classification and its necessary ingredients, such as feature reduction and selection, choice of classifier, and assessment of classification performance.

So far, we have been dealing with the 'whence' in our title, depicting the state of the art; we will end the talk with the 'whither': promising applications and some speculations on dead-end approaches.