On the Correlation of Automatic Audio and Visual Segmentations of Music Videos

  • Authors:
  • O. Gillet;S. Essid;G. Richard

  • Affiliations:
  • GET-Telecom, LTCI-CNRS, Paris;-;-

  • Venue:
  • IEEE Transactions on Circuits and Systems for Video Technology
  • Year:
  • 2007

Quantified Score

Hi-index 0.00

Visualization

Abstract

The study of the associations between audio and video content has numerous important applications in the fields of information retrieval and multimedia content authoring. In this work, we focus on music videos which exhibit a broad range of structural and semantic relationships between the music and the video content. To identify such relationships, a two-level automatic structuring of the music and the video is achieved separately. Note onsets are detected from the music signal, along with section changes. The latter is achieved by a novel algorithm which makes use of feature selection and statistical novelty detection approaches based on kernel methods. The video stream is independently segmented to detect changes in motion activity, as well as shot boundaries. Based on this two-level segmentation of both streams, four audio-visual correlation measures are computed. The usefulness of these correlation measures is illustrated by a query by video experiment on a 100 music video database, which also exhibits interesting genre dependencies