We are not equally negative: fine-grained labeling for multimedia event detection

  • Authors:
  • Zhigang Ma (University of Trento, Trento, Italy); Yi Yang (Carnegie Mellon University, Pittsburgh, PA, USA); Zhongwen Xu (Zhejiang University, Hangzhou, China); Nicu Sebe (University of Trento, Trento, Italy); Alexander G. Hauptmann (Carnegie Mellon University, Pittsburgh, PA, USA)

  • Venue:
  • Proceedings of the 21st ACM international conference on Multimedia
  • Year:
  • 2013

Abstract

Multimedia event detection (MED) is an effective technique for video indexing and retrieval. Current classifier training for MED treats all negative videos equally. However, many negative videos resemble the positive videos to different degrees. Intuitively, we could capture more informative cues from the negative videos if we assigned them fine-grained labels, thereby benefiting classifier learning. To this end, we apply a statistical method to both the positive and negative examples to identify the decisive attributes of a specific event. Based on these decisive attributes, we assign fine-grained labels to the negative examples so that they are treated differently and exploited more effectively. The resulting fine-grained labels may not be accurate enough to characterize the negative videos. Hence, we propose to jointly optimize the fine-grained labels with knowledge from the visual features and the attribute representations, which benefits both. Our model obtains two kinds of classifiers, one from the attributes and one from the features, each incorporating the informative cues from the fine-grained labels. The outputs of both classifiers on the testing videos are fused for detection. Extensive experiments on the challenging TRECVID MED 2012 development set validate the efficacy of our proposed approach.
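
The following is a minimal sketch, not the authors' implementation, of the pipeline the abstract outlines: selecting decisive attributes with a simple statistic, assigning soft fine-grained labels to negatives, training one classifier on features and one on attributes, and fusing their scores. The synthetic data, the two-sample statistic, the top-5 attribute cutoff, the [0, 0.5] label scale, and the use of Ridge regression are all illustrative assumptions.

```python
import numpy as np
from sklearn.linear_model import Ridge

rng = np.random.default_rng(0)

# Synthetic data: low-level visual features X and attribute representations A.
n_pos, n_neg, n_feat, n_attr = 50, 200, 128, 20
X_pos = rng.normal(1.0, 1.0, (n_pos, n_feat))
X_neg = rng.normal(0.0, 1.0, (n_neg, n_feat))
A_pos = rng.normal(1.0, 1.0, (n_pos, n_attr))
A_neg = rng.normal(0.0, 1.0, (n_neg, n_attr))

# 1) Pick "decisive" attributes: those whose means differ most between
#    positives and negatives (a simple two-sample statistic as a stand-in
#    for the paper's statistical selection).
diff = np.abs(A_pos.mean(0) - A_neg.mean(0))
pooled_std = np.sqrt(A_pos.var(0) / n_pos + A_neg.var(0) / n_neg) + 1e-8
t_stat = diff / pooled_std
decisive = np.argsort(t_stat)[-5:]          # top-5 attributes (assumed cutoff)

# 2) Assign fine-grained labels to negatives: negatives whose decisive
#    attributes resemble the positive prototype receive labels closer to 1.
proto = A_pos[:, decisive].mean(0)
sim = -np.linalg.norm(A_neg[:, decisive] - proto, axis=1)
sim = (sim - sim.min()) / (sim.max() - sim.min() + 1e-8)
y_neg_fine = 0.5 * sim                      # soft labels in [0, 0.5] (assumed scale)

y = np.concatenate([np.ones(n_pos), y_neg_fine])
X = np.vstack([X_pos, X_neg])
A = np.vstack([A_pos, A_neg])

# 3) Train one classifier on features and one on attributes against the
#    fine-grained labels (Ridge regression stands in for the paper's joint model,
#    which also refines the labels during learning).
clf_feat = Ridge(alpha=1.0).fit(X, y)
clf_attr = Ridge(alpha=1.0).fit(A, y)

# 4) Late fusion of the two scores on a test video.
x_test = rng.normal(1.0, 1.0, (1, n_feat))
a_test = rng.normal(1.0, 1.0, (1, n_attr))
score = 0.5 * clf_feat.predict(x_test) + 0.5 * clf_attr.predict(a_test)
print("fused detection score:", float(score[0]))
```

In this sketch the fine-grained labels are fixed before training; the paper's joint optimization instead updates them together with the two classifiers, which the abstract credits with correcting inaccurate initial labels.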