Multimodal fusion for video copy detection

  • Authors:
  • Xavier Anguera;Juan Manuel Barrios;Tomasz Adamek;Nuria Oliver

  • Affiliations:
  • Telefonica Research, Barcelona, Spain;University of Chile, Santiago de Chile, Chile;Telefonica Research, Barcelona, Spain;Telefonica Research, Barcelona, Spain

  • Venue:
  • MM '11 Proceedings of the 19th ACM international conference on Multimedia
  • Year:
  • 2011

Abstract

Content-based video copy detection (CBCD) algorithms focus on detecting video segments that are identical to, or transformed versions of, segments in a known video. In recent years, some systems have combined orthogonal modalities (e.g., derived from audio and video) to improve detection performance, although not always with consistent results. In this paper we propose a fusion algorithm that can combine as many modalities as are available at the decision level. The algorithm is based on a weighted sum of the normalized scores, which are modified depending on how well they rank in each modality. This leads to a virtually parameter-free fusion algorithm. We performed several tests using the 2010 TRECVID VCD datasets and obtained up to 46% relative improvement in min-NDCR, while also improving the F1 metric on the fused results, compared to using only the best single modality.
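
The abstract does not give the exact normalization or rank weighting, so the following is only a minimal sketch of what a rank-aware, decision-level fusion of this kind might look like. It assumes min-max normalization within each modality and a 1/rank weight; both choices, and the names `normalize` and `fuse`, are illustrative assumptions and not the authors' published formulation.

```python
from collections import defaultdict


def normalize(scores):
    """Min-max normalize a {candidate: raw_score} dict to [0, 1].

    Assumption: min-max scaling is used here only for illustration.
    """
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        # All candidates scored the same; treat them as equally good.
        return {cand: 1.0 for cand in scores}
    return {cand: (s - lo) / (hi - lo) for cand, s in scores.items()}


def fuse(modalities):
    """Fuse per-modality candidate scores at the decision level.

    modalities: list of {candidate: raw_score} dicts, one per modality.
    Assumptions (not from the paper): higher scores are better, and each
    normalized score is down-weighted by 1 / rank within its modality.
    Returns a list of (candidate, fused_score), best first.
    """
    fused = defaultdict(float)
    for scores in modalities:
        if not scores:
            continue  # a modality with no detections contributes nothing
        norm = normalize(scores)
        ranked = sorted(norm, key=norm.get, reverse=True)
        for rank, cand in enumerate(ranked, start=1):
            fused[cand] += norm[cand] / rank
    return sorted(fused.items(), key=lambda kv: kv[1], reverse=True)


# Hypothetical candidate scores from two modalities.
audio = {"a": 10, "b": 5, "c": 0}
video = {"b": 0.9, "a": 0.3, "d": 0.1}
result = fuse([audio, video])
```

Because the function loops over a list of modalities, adding a third or fourth modality requires no new parameters, which is in the spirit of the "virtually parameter-free" property the abstract describes. In this toy input, candidate `b` ends up ahead of `a` because it ranks first in one modality and second in the other with a relatively high normalized score.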