Semi-supervised learning for semantic video retrieval

  • Authors:
  • Ralph Ewerth; Bernd Freisleben

  • Affiliations:
  • University of Marburg, Marburg, Germany (both authors)

  • Venue:
  • Proceedings of the 6th ACM international conference on Image and video retrieval
  • Year:
  • 2007


Abstract

The automatic understanding of audiovisual content for multimedia retrieval is a difficult task, since the meaning and appearance of a certain event or concept are strongly determined by contextual information. For example, the appearance of a high-level concept, such as a map or a news anchor, depends on the editing layout, which is usually characteristic of a particular broadcasting station. In this paper, we show that it is possible to adaptively learn the appearance of certain objects or events for a particular test video by utilizing unlabeled data in order to improve a subsequent retrieval process. First, an initial model is obtained via supervised learning using a set of appropriate training videos. Then, this initial model is used to rank the shots of each test video v separately. This ranking is used to label the most relevant and most irrelevant shots in video v for subsequent use as training data in a semi-supervised learning process. Based on these automatically labeled training data, relevant features are selected for the concept under consideration for video v. Then, two additional classifiers are trained on the automatically labeled data of this video. AdaBoost and Support Vector Machines (SVM) are incorporated for feature selection and ensemble classification. Finally, the newly trained classifiers and the initial model form an ensemble. Experimental results on TRECVID 2005 video data demonstrate the feasibility of the proposed learning scheme for certain high-level concepts.
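The per-video self-training loop described in the abstract (rank shots with the initial model, pseudo-label the most and least relevant ones, train a video-specific classifier, and combine it with the initial model in an ensemble) can be sketched as follows. This is a minimal illustration, not the authors' implementation: a toy nearest-centroid classifier stands in for the SVM/AdaBoost models of the paper, and all function and variable names are hypothetical.

```python
# Hedged sketch of a per-video self-training scheme: a nearest-centroid
# classifier replaces the paper's SVM/AdaBoost models; names are illustrative.

def centroid(vectors):
    """Component-wise mean of a list of feature vectors."""
    n = len(vectors)
    return [sum(v[i] for v in vectors) / n for i in range(len(vectors[0]))]

def dist(a, b):
    """Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5

class NearestCentroid:
    """Toy stand-in for the video-specific classifiers of the paper."""
    def fit(self, pos, neg):
        self.cp, self.cn = centroid(pos), centroid(neg)
        return self
    def score(self, x):
        # Higher score = closer to the "relevant" centroid.
        return dist(x, self.cn) - dist(x, self.cp)

def self_train_rank(initial_score, shots, k):
    """Rank the shots of one test video, pseudo-label the k highest- and
    k lowest-ranked shots, train an adapted classifier on these automatic
    labels, and re-rank with an ensemble of both models."""
    ranked = sorted(shots, key=initial_score, reverse=True)
    pos, neg = ranked[:k], ranked[-k:]          # automatically labeled data
    adapted = NearestCentroid().fit(pos, neg)   # video-specific model
    # Ensemble: average the initial and the adapted scores.
    return sorted(
        shots,
        key=lambda x: 0.5 * initial_score(x) + 0.5 * adapted.score(x),
        reverse=True,
    )

# Toy usage: 1-D shot "features"; the initial model scores by first coordinate.
shots = [[0.9], [0.1], [0.8], [0.2], [0.7]]
result = self_train_rank(lambda x: x[0], shots, k=2)
print(result[0])  # most relevant shot after adaptation
```

In the paper the re-ranking step also performs per-concept feature selection on the pseudo-labeled data before training the two additional classifiers; that step is omitted here for brevity.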