Marginalized multi-layer multi-instance kernel for video concept detection

  • Authors:
  • Zheng-Jun Zha;Tao Mei;Richang Hong;Zhiwei Gu

  • Affiliations:
  • School of Computing, National University of Singapore, Singapore;Microsoft Research Asia, Beijing, China;Google Inc., USA;Hefei University of Technology, China

  • Venue:
  • Signal Processing
  • Year:
  • 2013

Abstract

Video concept detection has been studied extensively in recent years. Most existing approaches treat video as a flat data sequence. However, video is inherently hierarchical: it comprises multiple layers (e.g., shot, frame, and region), and a multi-instance relationship is embedded in each pair of contiguous layers. In this paper, we propose a novel kernel, termed the marginalized multi-layer multi-instance (MarMLMI) kernel, for video concept detection. Unlike most existing methods, the MarMLMI kernel exploits the hierarchical structure of video, i.e., both the multi-layer structure and the multi-instance relationship. Furthermore, it addresses the instance label ambiguity of the multi-instance setting by means of the marginalized kernel technique. We perform video concept detection on a real-world video corpus, the TREC Video Retrieval Evaluation (TRECVID) benchmark, and compare the proposed MarMLMI kernel with representative existing approaches. The experimental results demonstrate the effectiveness of the proposed MarMLMI kernel.
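
The abstract does not give the kernel's definition, so the following is only an illustrative sketch of the general idea and not the MarMLMI formulation from the paper. It assumes a video shot is a tree (shot, frames, regions), compares two trees recursively by averaging a kernel over all pairs of children, and handles unknown instance labels in the marginalized-kernel style by weighting each pair with the probability that their hidden labels agree. The names `Node`, `p_pos`, `rbf`, `label_agreement`, and `mlmi_kernel`, the RBF base kernel, and the label posteriors are all assumptions made for this example.

```python
import math
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    """One node of a shot -> frame -> region tree (hypothetical structure)."""
    feat: List[float]
    p_pos: float = 0.5  # assumed posterior that the hidden label is positive
    children: List["Node"] = field(default_factory=list)


def rbf(x: List[float], y: List[float], gamma: float = 0.5) -> float:
    """Gaussian RBF kernel on two feature vectors of equal length."""
    d2 = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * d2)


def label_agreement(a: Node, b: Node) -> float:
    """Probability that the hidden binary labels of a and b coincide.

    This is the marginalization of the joint kernel [h == h'] over the
    assumed label posteriors p_pos of the two nodes.
    """
    return a.p_pos * b.p_pos + (1.0 - a.p_pos) * (1.0 - b.p_pos)


def mlmi_kernel(a: Node, b: Node, gamma: float = 0.5) -> float:
    """Recursive tree kernel: base kernel times the label-weighted mean
    of the kernel over all pairs of children (a set-kernel recursion)."""
    base = rbf(a.feat, b.feat, gamma)
    if not a.children or not b.children:
        return base  # leaf level (e.g., regions): base kernel only
    total = 0.0
    for c in a.children:
        for d in b.children:
            total += label_agreement(c, d) * mlmi_kernel(c, d, gamma)
    return base * total / (len(a.children) * len(b.children))


# Two toy shots, each with frames that contain regions (made-up values).
shot_a = Node([0.0, 1.0], children=[
    Node([1.0], p_pos=0.9, children=[Node([0.2], 0.8), Node([0.9], 0.3)]),
    Node([0.5], p_pos=0.4, children=[Node([0.1], 0.6)]),
])
shot_b = Node([0.1, 0.8], children=[
    Node([0.9], p_pos=0.7, children=[Node([0.3], 0.7)]),
])

k_ab = mlmi_kernel(shot_a, shot_b)
k_ba = mlmi_kernel(shot_b, shot_a)
```

A Gram matrix built from such a function could then be passed to any kernel machine that accepts precomputed kernels. In practice the posteriors `p_pos` would have to be estimated from data; here they are fixed by hand purely for illustration.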