Exploring multi-modality structure for cross domain adaptation in video concept annotation

  • Authors:
  • Shaoxi Xu; Sheng Tang; Yongdong Zhang; Jintao Li; Yan-Tao Zheng

  • Affiliations:
  • Shaoxi Xu, Sheng Tang, Yongdong Zhang, Jintao Li: Institute of Computing Technology, Chinese Academy of Sciences, 617H, No. 6 Kexueyuan South Road, Zhongguancun, Haidian District, Beijing 100190, PR China (Shaoxi Xu also with Graduate University of Chinese Academ ...)
  • Yan-Tao Zheng: Institute for Infocomm Research (I2R), 138632, Singapore

  • Venue:
  • Neurocomputing
  • Year:
  • 2012



Abstract

Domain-adaptive video concept detection and annotation has recently received significant attention, but existing video adaptation processes treat all features as a single modality, ignoring multi-modality, a unique and important property of video data. To fill this gap, we propose a novel approach, named multi-modality transfer based on multi-graph optimization (MMT-MGO), which leverages multi-modality knowledge generalized by auxiliary classifiers in the source domains to assist multi-graph optimization (a graph-based semi-supervised learning method) in the target domain for video concept annotation. To the best of our knowledge, this is the first work to introduce multi-modality transfer into the field of domain-adaptive video concept detection and annotation. Moreover, we propose an efficient incremental extension scheme that sequentially estimates a small batch of newly arriving data without modifying the structure of the multi-graph scheme. The proposed scheme achieves accuracy comparable to a full re-optimization that merges the new data into the existing corpus, while greatly reducing estimation time. Extensive experiments on the TRECVID 2005-2007 data sets demonstrate the effectiveness of both the multi-modality transfer scheme and the incremental extension scheme.
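The abstract does not give the authors' formulation, but the two core ideas (graph-based label propagation over per-modality graphs anchored to source-classifier scores, and incremental scoring of new samples without rebuilding the graphs) can be illustrated with a minimal sketch. All function names, the Gaussian-affinity graphs, and the propagation update below are illustrative assumptions, not the paper's actual MMT-MGO algorithm:

```python
import numpy as np

def modality_graph(features, sigma=1.0):
    """Symmetrically normalized Gaussian affinity graph for one modality
    (hypothetical helper, not from the paper)."""
    d2 = ((features[:, None, :] - features[None, :, :]) ** 2).sum(-1)
    W = np.exp(-d2 / (2 * sigma ** 2))
    np.fill_diagonal(W, 0.0)
    d_inv = 1.0 / np.sqrt(np.maximum(W.sum(1), 1e-12))
    return d_inv[:, None] * W * d_inv[None, :]  # D^{-1/2} W D^{-1/2}

def multi_graph_propagate(modal_feats, prior, weights=None, alpha=0.9, iters=100):
    """Propagate concept scores over a weighted combination of per-modality
    graphs, anchored to `prior` (standing in for scores produced by the
    auxiliary source-domain classifiers)."""
    if weights is None:
        weights = np.ones(len(modal_feats)) / len(modal_feats)
    S = sum(w * modality_graph(X) for w, X in zip(weights, modal_feats))
    F = prior.copy()
    for _ in range(iters):
        # Classic label-propagation update: diffuse along the graph,
        # then pull back toward the source-classifier prior.
        F = alpha * (S @ F) + (1 - alpha) * prior
    return F

def incremental_estimate(new_modal_feats, modal_feats, F, weights=None, sigma=1.0):
    """Score a small batch of new samples without rebuilding the graphs:
    each new sample inherits a similarity-weighted average of the already
    propagated scores F (a rough analogue of an incremental extension)."""
    if weights is None:
        weights = np.ones(len(modal_feats)) / len(modal_feats)
    A = 0.0
    for w, X_new, X in zip(weights, new_modal_feats, modal_feats):
        d2 = ((X_new[:, None, :] - X[None, :, :]) ** 2).sum(-1)
        A = A + w * np.exp(-d2 / (2 * sigma ** 2))
    A = A / np.maximum(A.sum(1, keepdims=True), 1e-12)
    return A @ F

# Toy usage: two modalities (e.g. visual and text features), one concept.
rng = np.random.default_rng(0)
visual, text = rng.normal(size=(20, 5)), rng.normal(size=(20, 3))
prior = rng.uniform(size=(20, 1))          # stand-in source-classifier scores
scores = multi_graph_propagate([visual, text], prior)

new_visual, new_text = rng.normal(size=(3, 5)), rng.normal(size=(3, 3))
new_scores = incremental_estimate([new_visual, new_text], [visual, text], scores)
print(scores.shape, new_scores.shape)  # (20, 1) (3, 1)
```

The incremental step costs only one affinity computation per new sample against the existing corpus, which is why it can be much cheaper than re-running the full propagation.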