Audio-visual fusion using Bayesian model combination for web video retrieval

  • Authors:
  • Vasant Manohar;Stavros Tsakalidis;Pradeep Natarajan;Rohit Prasad;Prem Natarajan

  • Affiliations:
  • Raytheon BBN Technologies, Cambridge, MA, USA;Raytheon BBN Technologies, Cambridge, MA, USA;Raytheon BBN Technologies, Cambridge, MA, USA;Raytheon BBN Technologies, Cambridge, MA, USA;Raytheon BBN Technologies, Cambridge, MA, USA

  • Venue:
  • MM '11 Proceedings of the 19th ACM international conference on Multimedia
  • Year:
  • 2011

Abstract

Combining features from multiple heterogeneous audio-visual sources can significantly improve retrieval performance on consumer-domain videos. However, such videos often contain unrelated overlaid audio content, or have too much camera motion for visual features to be extracted reliably. We present an approach that overcomes errors in individual feature streams by combining classifiers trained on multiple heterogeneous feature streams using Bayesian model combination (BAYCOM). We demonstrate our method by combining low-level audio and visual features to classify a large, 200-hour web video corpus. The combined models outperform any of the individual features by 10%. Further, BAYCOM consistently outperforms traditional early and late fusion methods.
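
The abstract does not spell out how BAYCOM weights the individual classifiers, so the following is only a minimal sketch of the general idea of score-level fusion, not the authors' algorithm. It contrasts plain late fusion (uniform averaging of per-stream scores) with a generic Bayesian-style weighting in which each stream's weight is its normalized likelihood on held-out data under a uniform prior. All function names, the two streams, and the toy numbers are hypothetical and chosen for illustration.

```python
import math

def late_fusion(scores):
    """Plain late fusion: average the per-stream scores with uniform weights."""
    return sum(scores) / len(scores)

def stream_weights(heldout_probs, heldout_labels):
    """Weight each stream by its likelihood on held-out data.

    Assumes a uniform prior over streams, so the weights are the
    normalized held-out likelihoods. This is a generic stand-in,
    not the BAYCOM procedure from the paper.
    """
    log_liks = []
    for probs in heldout_probs:
        ll = 0.0
        for p, y in zip(probs, heldout_labels):
            p = min(max(p, 1e-6), 1.0 - 1e-6)  # clip to avoid log(0)
            ll += math.log(p if y == 1 else 1.0 - p)
        log_liks.append(ll)
    top = max(log_liks)  # subtract the max for numerical stability
    unnorm = [math.exp(ll - top) for ll in log_liks]
    total = sum(unnorm)
    return [u / total for u in unnorm]

def weighted_fusion(scores, weights):
    """Combine per-stream scores using the given stream weights."""
    return sum(w * s for w, s in zip(weights, scores))

# Hypothetical held-out predictions: P(label == 1) from each stream.
heldout_labels = [1, 0, 1, 0]
audio_probs = [0.9, 0.2, 0.8, 0.1]   # stream that fits the held-out labels well
visual_probs = [0.6, 0.5, 0.4, 0.5]  # stream that is close to chance

weights = stream_weights([audio_probs, visual_probs], heldout_labels)

# Scores for one new video from the audio and visual classifiers.
new_scores = [0.8, 0.4]
uniform_score = late_fusion(new_scores)
weighted_score = weighted_fusion(new_scores, weights)
```

In this toy setup the audio stream explains the held-out labels better, so it receives the larger weight and the weighted score moves toward the audio classifier's output, whereas uniform late fusion treats both streams equally.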