A New Learning Algorithm for the Fusion of Adaptive Audio-Visual Features for the Retrieval and Classification of Movie Clips

  • Authors:
  • Paisarn Muneesawang; Ling Guan; Tahir Amin

  • Affiliations:
  • Department of Electrical and Computer Engineering, Naresuan University, Phitsanulok, Thailand 65000; Department of Electrical and Computer Engineering, Ryerson University, Toronto, Canada M5B 2K3; Department of Electrical and Computer Engineering, University of Toronto, Toronto, Canada

  • Venue:
  • Journal of Signal Processing Systems
  • Year:
  • 2010


Abstract

This paper presents a new learning algorithm for audiovisual fusion and demonstrates its application to video classification for a film database. The proposed system utilizes perceptual features to characterize the content of movie clips. These features are extracted from different modalities and fused through a machine learning process. More specifically, to capture spatio-temporal information, adaptive video indexing is adopted to extract visual features, and a statistical model based on Laplacian mixtures is utilized to extract audio features. These features are combined at a late fusion stage and input to a support vector machine (SVM) to learn semantic concepts from a given video database. Experimental results show that the proposed system, implementing the SVM-based fusion technique, achieves high classification accuracy when applied to a large database of Hollywood movies.
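The late-fusion scheme described above can be illustrated with a minimal sketch: per-clip feature vectors from each modality are concatenated and the joint vector is fed to an SVM classifier. This is a generic illustration of late fusion using scikit-learn, not the authors' implementation; the feature values below are synthetic stand-ins for the adaptive video indexing (visual) and Laplacian-mixture (audio) descriptors discussed in the paper.

```python
import numpy as np
from sklearn.svm import SVC

def late_fusion_features(visual, audio):
    # Late fusion: concatenate per-clip feature vectors from both modalities
    # into a single joint descriptor before classification.
    return np.concatenate([visual, audio], axis=1)

# Synthetic stand-in features for two semantic classes of movie clips.
# In the paper, visual features come from adaptive video indexing and
# audio features from a Laplacian mixture model; random clusters are
# used here purely for illustration.
rng = np.random.default_rng(0)
visual_a = rng.normal(0.0, 0.2, size=(20, 4))
visual_b = rng.normal(2.0, 0.2, size=(20, 4))
audio_a = rng.normal(0.0, 0.2, size=(20, 3))
audio_b = rng.normal(2.0, 0.2, size=(20, 3))

X = late_fusion_features(np.vstack([visual_a, visual_b]),
                         np.vstack([audio_a, audio_b]))
y = np.array([0] * 20 + [1] * 20)

# Train an SVM on the fused descriptors to learn the semantic concept.
clf = SVC(kernel="rbf", gamma="scale").fit(X, y)
print(clf.score(X, y))
```

The key design point is that each modality keeps its own feature extractor; fusion happens only at the descriptor level, so either modality's pipeline can be changed without retraining the other's extractor.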