Affective Audio-Visual Words and Latent Topic Driving Model for Realizing Movie Affective Scene Classification

Authors:
G. Irie; T. Satou; A. Kojima; T. Yamasaki; K. Aizawa
Affiliations:
NTT Cyber Solutions Labs., NTT Corp., Yokosuka, Japan;-;-;-;-
Venue:
IEEE Transactions on Multimedia
Year:
2010

Citing: 0
Cited by: 3

Introduction to the special issue on learning from multi-label data. Machine Learning
Learning representations for affective video understanding. Proceedings of the 21st ACM international conference on Multimedia
Human emotion recognition from videos using spatio-temporal and audio features. The Visual Computer: International Journal of Computer Graphics

Abstract

This paper presents a novel method for movie affective scene classification that outputs the emotion, in the form of labels, that a scene is likely to arouse in viewers. Because users' affective preferences play an important role in movie selection, affective scene classification has the potential to enable more attractive, user-centric movie search and browsing applications. Two main issues in designing movie affective scene classification are considered: how to extract features that are strongly related to viewers' emotions, and how to map the extracted features to emotion categories. For the former, we propose a method for extracting emotion-category-specific audio-visual features named affective audio-visual words (AAVWs). For the latter, we propose a classification model named the latent topic driving model (LTDM). Assuming that viewers' emotions change dynamically in response to the sequence of movie scenes, LTDM models emotions as Markovian dynamic systems driven by the sequential stimuli of the movie content. Experiments on 206 movie scenes extracted from 24 movie titles, with labels from eight emotion categories assigned by 16 subjects, show that our method outperforms conventional approaches in terms of the subject agreement rate.
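
The idea of emotions evolving as a Markovian system driven by scene-by-scene stimuli can be illustrated with a generic forward-filtering step from a hidden Markov model. The sketch below is not the LTDM described in the abstract, whose details are not given here. It assumes a hypothetical transition matrix between emotion states and a hypothetical per-scene likelihood vector (standing in for evidence derived from audio-visual features), and every function name and number is illustrative only.

```python
# Minimal sketch of an input-driven Markov emotion filter.
# Assumption: a plain HMM-style forward update, NOT the paper's LTDM.
# All names and numbers here are hypothetical.

def normalize(weights):
    """Scale non-negative weights so that they sum to 1."""
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must have a positive sum")
    return [w / total for w in weights]


def forward_step(belief, transition, likelihood):
    """Update the emotion-state belief for one scene.

    belief[i]        : probability of emotion state i before this scene
    transition[i][j] : probability of moving from state i to state j
    likelihood[j]    : how well the scene's features fit state j
    """
    n = len(belief)
    # Predict: propagate the previous belief through the Markov transition.
    predicted = [sum(belief[i] * transition[i][j] for i in range(n))
                 for j in range(n)]
    # Correct: weight the prediction by the current scene's evidence.
    return normalize([predicted[j] * likelihood[j] for j in range(n)])


def classify_scenes(prior, transition, likelihoods):
    """Return the most probable emotion-state index for each scene in order."""
    belief = list(prior)
    labels = []
    for likelihood in likelihoods:
        belief = forward_step(belief, transition, likelihood)
        labels.append(max(range(len(belief)), key=belief.__getitem__))
    return labels


if __name__ == "__main__":
    prior = [0.5, 0.5]                       # two hypothetical emotion states
    transition = [[0.9, 0.1], [0.1, 0.9]]    # emotions tend to persist
    likelihoods = [[0.8, 0.2], [0.4, 0.6], [0.1, 0.9]]
    print(classify_scenes(prior, transition, likelihoods))
```

In this toy run the second scene's evidence slightly favors state 1, yet the label stays at state 0 because the previous belief carries over through the transition matrix. That carry-over is the kind of sequential dependence the abstract attributes to viewers' emotions.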