Improving cluster selection and event modeling in unsupervised mining for automatic audiovisual video structuring

  • Authors:
  • Anh-Phuong Ta; Mathieu Ben; Guillaume Gravier

  • Affiliations:
  • INRIA Rennes, Rennes Cedex, France; Powedia, Rennes Cedex, France; CNRS-IRISA, Rennes Cedex, France

  • Venue:
  • MMM'12: Proceedings of the 18th International Conference on Advances in Multimedia Modeling
  • Year:
  • 2012


Abstract

Can we discover audio-visually consistent events from videos in a fully unsupervised manner? And how can we mine videos of different genres? In this paper we present new results in automatically discovering audio-visual events. A new measure is proposed to select audio-visually consistent elements from the two dendrograms that respectively represent the hierarchical clustering results for the audio and visual modalities. Each selected element corresponds to a candidate event. To build a model for each event, each candidate is represented as a group of clusters, and a voting mechanism is applied to select training examples for discriminative classifiers. Finally, the trained model is applied to the entire video to select the segments that belong to the discovered event. Experimental results on several different and challenging video genres show the effectiveness of our approach.
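The abstract does not spell out the consistency measure used to match elements across the audio and visual dendrograms, but the selection step can be sketched under a simple assumption: each cluster is a set of video-segment indices, and cross-modal consistency is scored by segment overlap (Jaccard index here, a hypothetical stand-in for the paper's measure). Pairs whose overlap exceeds a threshold become candidate events.

```python
# Hedged sketch of cross-modal cluster selection. The Jaccard overlap and the
# 0.5 threshold are illustrative assumptions, not the paper's actual measure.

def jaccard(a, b):
    """Overlap between two clusters given as sets of segment indices."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)

def select_consistent_pairs(audio_clusters, visual_clusters, threshold=0.5):
    """Pair each audio cluster with its best-matching visual cluster and keep
    pairs whose overlap meets the threshold; each kept pair stands in for an
    audio-visually consistent candidate event."""
    candidates = []
    for i, a in enumerate(audio_clusters):
        j, score = max(
            ((j, jaccard(a, v)) for j, v in enumerate(visual_clusters)),
            key=lambda t: t[1],
        )
        if score >= threshold:
            candidates.append((i, j, score))
    return candidates

# Toy example: 10 video segments clustered independently in each modality.
audio = [{0, 1, 2, 3}, {4, 5, 6}, {7, 8, 9}]
visual = [{0, 1, 2}, {3, 4, 5, 6}, {7, 8, 9}]
print(select_consistent_pairs(audio, visual))
```

In a full pipeline, the segments inside each selected pair would then feed the voting step that picks training examples for the discriminative classifier described in the abstract.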