Robust Video Content Analysis via Transductive Learning

  • Authors:
  • Ralph Ewerth; Markus Mühling; Bernd Freisleben

  • Affiliations:
  • University of Marburg, Germany; University of Marburg, Germany; University of Marburg, Germany

  • Venue:
  • ACM Transactions on Intelligent Systems and Technology (TIST)
  • Year:
  • 2012

Abstract

Reliable video content analysis is an essential prerequisite for effective video search. An important current research question is how to develop robust video content analysis methods that produce satisfactory results for a large variety of video sources, distribution platforms, genres, and content. The work presented in this article exploits the observation that the appearance of objects and events is often related to a particular video sequence, episode, program, or broadcast. This motivates our idea of considering the content analysis task for a single video or episode as a transductive setting: the final classification model must be optimal for the given video only, and not in general, as expected for inductive learning. For this purpose, the unlabeled video test data have to be used in the learning process. In this article, a transductive learning framework for robust video content analysis based on feature selection and ensemble classification is presented. In contrast to related transductive approaches for video analysis (e.g., for concept detection), the framework is designed in a general manner and not only for a single task. The proposed framework is applied to the following video analysis tasks: shot boundary detection, face recognition, semantic video retrieval, and semantic indexing of computer game sequences. Experimental results for diverse video analysis tasks and large test sets demonstrate that the proposed transductive framework improves the robustness of the underlying state-of-the-art approaches, whereas transductive support vector machines do not solve particular tasks in a satisfactory manner.