Audiovisual three-level fusion for continuous estimation of Russell's emotion circumplex

  • Authors:
  • Enrique Sánchez-Lozano; Paula Lopez-Otero; Laura Docio-Fernandez; Enrique Argones-Rúa; José Luis Alba-Castro

  • Affiliations:
  • University of Vigo, Vigo, Spain; University of Vigo, Vigo, Spain; University of Vigo, Vigo, Spain; Gradiant, Vigo, Spain; University of Vigo, Vigo, Spain

  • Venue:
  • Proceedings of the 3rd ACM international workshop on Audio/visual emotion challenge
  • Year:
  • 2013

Abstract

Predicting human emotions is attracting attention from many research areas that demand accurate predictions in uncontrolled scenarios. Despite this interest, systems designed for emotion detection are still far from the desired accuracy. Human emotion is commonly described along two dimensions, valence and arousal, which span Russell's circumplex, where complex emotions lie. Accordingly, the Affect Recognition Sub-Challenge (ASC) of the third Audio/Visual Emotion and Depression Challenge, AVEC'13, focuses on estimating these two dimensions. This paper presents a three-level fusion system that combines single-regressor results from audio and visual features in order to maximize the mean average correlation over both dimensions. Five sets of features are extracted (three for audio and two for video) and merged following an iterative process. Results show that this fusion outperforms the baseline method on the challenge database.
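
The abstract does not spell out the iterative fusion rule, so the sketch below is only one plausible reading: per-feature-set regressor outputs are greedily averaged in, one at a time, whenever doing so improves the mean Pearson correlation (averaged over valence and arousal) on development data. All names here (the feature-set keys, `greedy_fusion`, array shapes) are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def pearson(a, b):
    """Pearson correlation between two 1-D arrays."""
    a = a - a.mean()
    b = b - b.mean()
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12))

def mean_corr(pred, truth):
    """Mean correlation over the valence and arousal dimensions.
    pred, truth: arrays of shape (n_frames, 2)."""
    return float(np.mean([pearson(pred[:, d], truth[:, d])
                          for d in range(truth.shape[1])]))

def greedy_fusion(preds, truth):
    """Iteratively add the single-regressor prediction whose inclusion
    (by simple averaging) most improves the mean correlation.

    preds: dict mapping a feature-set name to an (n_frames, 2) prediction array.
    truth: (n_frames, 2) ground-truth valence/arousal labels (development set).
    Returns (selected feature-set names in inclusion order, best score)."""
    remaining = set(preds)
    selected, best_score = [], -np.inf
    while remaining:
        # Score every remaining candidate when averaged with the current selection.
        name, score = max(
            ((n, mean_corr(np.mean([preds[s] for s in selected + [n]], axis=0), truth))
             for n in remaining),
            key=lambda t: t[1])
        if score <= best_score:   # stop when no candidate improves the fusion
            break
        selected.append(name)
        best_score = score
        remaining.remove(name)
    return selected, best_score

# Hypothetical usage with the paper's five feature sets (names invented here):
# preds = {"audio_mfcc": p1, "audio_prosody": p2, "audio_spectral": p3,
#          "video_appearance": p4, "video_geometry": p5}
# selected, score = greedy_fusion(preds, dev_labels)
```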