Affective ranking of movie scenes using physiological signals and content analysis

  • Authors:
  • Mohammad Soleymani;Guillaume Chanel;Joep J.M. Kierkels;Thierry Pun

  • Affiliations:
  • University of Geneva, Geneva, Switzerland (all authors)

  • Venue:
  • MS '08: Proceedings of the 2nd ACM Workshop on Multimedia Semantics
  • Year:
  • 2008


Abstract

In this paper, we propose an approach for affective ranking of movie scenes based on the emotions actually felt by spectators. Such a ranking can be used to characterize the affective, or emotional, content of video clips. The ranking can, for instance, help determine which video clip from a database elicits the most joy for a given user. This in turn permits video indexing and retrieval based on affective criteria corresponding to a personalized user affective profile. A dataset of 64 different scenes from 8 movies was shown to eight participants. While they watched, their physiological responses were recorded; namely, five peripheral physiological signals (GSR - galvanic skin resistance, EMG - electromyograms, blood pressure, respiration pattern, skin temperature) were acquired. After watching each scene, the participants were asked to self-assess their felt arousal and valence for that scene. In addition, the movie scenes were analyzed in order to characterize each with various audio- and video-based features capturing the key elements of the events occurring within that scene. Arousal and valence levels were estimated by a linear combination of features from physiological signals, as well as by a linear combination of content-based audio and video features. We show that a correlation exists between the arousal- and valence-based rankings provided by the spectators' self-assessments and the rankings obtained automatically from either physiological signals or audio-video features. This demonstrates the feasibility of using participants' physiological responses to characterize video scenes and to rank them according to their emotional content. It further shows that audio-visual features, either individually or combined, can be used fairly reliably to predict the spectator's felt emotion for a given scene. The results also confirm that participants exhibit different affective responses to movie scenes, which emphasizes the need for emotional profiles to be user-dependent.
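
The sketch below illustrates the general scheme the abstract describes: scores are estimated as a linear combination of per-scene features, and the resulting scene ranking is compared to a self-assessed ranking via rank correlation. This is a minimal illustration under stated assumptions, not the authors' implementation; the feature values, self-assessments, and the least-squares estimator are synthetic placeholders chosen for the example, and the paper does not specify these details here.

```python
# Minimal sketch: estimate felt arousal as a linear combination of
# per-scene physiological features, then compare the induced scene
# ranking to self-assessments with a rank correlation.
# All data below is synthetic; this is not the paper's dataset.
import numpy as np
from scipy.stats import spearmanr

rng = np.random.default_rng(0)

n_scenes = 64    # the dataset contains 64 scenes from 8 movies
n_features = 5   # e.g. GSR, EMG, blood pressure, respiration, skin temperature

# Hypothetical per-scene feature matrix and self-assessed arousal scores.
X = rng.normal(size=(n_scenes, n_features))
true_w = rng.normal(size=n_features)
arousal_self = X @ true_w + rng.normal(scale=0.5, size=n_scenes)

# Fit the linear combination by ordinary least squares (one possible
# choice of estimator; the abstract only states "linear combination").
w, *_ = np.linalg.lstsq(X, arousal_self, rcond=None)
arousal_pred = X @ w

# Rank scenes by predicted vs. self-assessed arousal and measure agreement.
rho, p_value = spearmanr(arousal_self, arousal_pred)
print(f"Spearman rank correlation: rho={rho:.2f}, p={p_value:.3g}")
```

The same scheme applies unchanged to valence, and to the content-based audio-video features in place of the physiological ones; a rank correlation (rather than, say, mean squared error) is the natural agreement measure because the stated goal is ranking scenes, not predicting exact scores.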