Video content categorization using the double decomposition

  • Authors:
  • Youtian Du; Feng Chen; Wenli Xu; Xueming Qian

  • Affiliations:
  • Ministry of Education Key Lab for Intelligent Networks and Network Security, Xi'an Jiaotong University, Xi'an, China 710049; Department of Automation, Tsinghua University, Beijing, China 100084; Department of Automation, Tsinghua University, Beijing, China 100084; School of Electronic and Information Engineering, Xi'an Jiaotong University, Xi'an, China 710049

  • Venue:
  • Multimedia Tools and Applications
  • Year:
  • 2013

Abstract

Video content has complex structure because of the variety of components and events involved. For example, surveillance videos often record multi-object interactions and contain motion detail at various scales, while web videos are composed of multimodal cues, each of which generally carries information at a variety of scales. In general, video content exhibits two types of combined inherent structure: multi-modality/multi-scale and multi-object/multi-scale. In this paper, we therefore propose a new framework for video content modeling in which video content is decomposed into multiple interacting processes by a double decomposition targeted at each type of combined structure. To model the resulting processes, we propose double-decomposed hidden Markov models (DDHMMs). DDHMMs contain multiple state chains that correspond to the interacting processes. To make the state-switching frequency in each chain consistent with the scale of the corresponding process, a durational state variable is introduced in DDHMMs. The proposed method performs well in modeling both the relations among the interacting processes and the dynamics of each process. We discuss appropriate features under the proposed framework and evaluate DDHMMs in two applications, human motion recognition and web video categorization. The experimental results demonstrate that the double decomposition enhances video categorization performance in both cases.