The acoustic-visual emotion Gaussians model for automatic generation of music video

  • Authors:
  • Ju-Chiang Wang; Yi-Hsuan Yang; I-Hong Jhuo; Yen-Yu Lin; Hsin-Min Wang

  • Affiliations:
  • Academia Sinica, Taipei City, Taiwan, ROC; Academia Sinica, Taipei City, Taiwan, ROC; National Taiwan University, Taipei City, Taiwan, ROC; Academia Sinica, Taipei City, Taiwan, ROC; Academia Sinica, Taipei City, Taiwan, ROC

  • Venue:
  • Proceedings of the 20th ACM international conference on Multimedia
  • Year:
  • 2012

Abstract

This paper presents a novel content-based system that uses the perceived emotion of multimedia content as a bridge between music and video. Specifically, we propose a novel machine learning framework, called Acoustic-Visual Emotion Gaussians (AVEG), to jointly learn the tripartite relationship among music, video, and emotion from an emotion-annotated corpus of music videos. For a music piece (or a video sequence), the AVEG model predicts its emotion distribution in a stochastic emotion space from the corresponding low-level acoustic (resp. visual) features. Finally, music and video are matched by measuring the similarity between their emotion distributions, using a distance measure such as the Kullback-Leibler (KL) divergence.
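To make the matching step concrete, here is a minimal sketch of the final ranking stage under a strong simplifying assumption: each item's emotion distribution is summarized as a single 2-D Gaussian over a valence-arousal plane, so the KL divergence has a closed form. The AVEG model itself learns richer stochastic emotion representations, for which KL divergence is typically approximated; the names, distributions, and numbers below are illustrative, not taken from the paper.

```python
import numpy as np

def gaussian_kl(mu0, cov0, mu1, cov1):
    """Closed-form KL(N0 || N1) between two multivariate Gaussians.

    A single-Gaussian simplification; AVEG's actual emotion
    distributions are richer, and their KL must be approximated.
    """
    k = mu0.shape[0]
    cov1_inv = np.linalg.inv(cov1)
    diff = mu1 - mu0
    return 0.5 * (
        np.trace(cov1_inv @ cov0)      # trace term
        + diff @ cov1_inv @ diff       # Mahalanobis term
        - k                            # dimensionality offset
        + np.log(np.linalg.det(cov1) / np.linalg.det(cov0))
    )

# Hypothetical emotion distributions in a 2-D valence-arousal space:
# one predicted from a music clip's acoustic features, and two
# candidates predicted from videos' visual features.
music = (np.array([0.6, 0.4]), np.diag([0.05, 0.08]))
videos = {
    "video_a": (np.array([0.55, 0.35]), np.diag([0.06, 0.07])),
    "video_b": (np.array([-0.30, 0.70]), np.diag([0.04, 0.05])),
}

# Match music to video: rank candidates by symmetrized KL distance.
scores = {}
for name, (mu, cov) in videos.items():
    scores[name] = 0.5 * (gaussian_kl(music[0], music[1], mu, cov)
                          + gaussian_kl(mu, cov, music[0], music[1]))
for name in sorted(scores, key=scores.get):
    print(f"{name}: symmetrized KL = {scores[name]:.3f}")
```

Because KL divergence is asymmetric, the sketch symmetrizes it before ranking; the abstract only commits to "a distance measure such as KL divergence", so other measures (e.g., Jensen-Shannon divergence) would slot in the same way.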