Video content categorization using the double decomposition

Authors:
Youtian Du; Feng Chen; Wenli Xu; Xueming Qian
Affiliations:
Ministry of Education Key Lab for Intelligent Networks and Network Security, Xi'an Jiaotong University, Xi'an, China 710049; Department of Automation, Tsinghua University, Beijing, China 100084; Department of Automation, Tsinghua University, Beijing, China 100084; School of Electronic and Information Engineering, Xi'an Jiaotong University, Xi'an, China 710049
Venue:
Multimedia Tools and Applications
Year:
2013


Abstract

Video content has complex structure because of the variety of components and events involved. For example, surveillance videos often record multi-object interactions that span various scales of motion detail, and web videos are composed of multimodal cues, each of which generally carries information at a variety of scales. In general, video content combines inherent structures in two ways: multi-modality/multi-scale and multi-object/multi-scale. In this paper, we therefore propose a new framework for video content modeling in which video content is decomposed into multiple interacting processes by a double decomposition that targets each type of combination. To model the resulting processes, we propose double-decomposed hidden Markov models (DDHMMs). DDHMMs contain multiple state chains that correspond to the interacting processes. To make the state-switching frequency of each chain consistent with the scale of the corresponding process, a durational state variable is introduced into DDHMMs. The proposed method performs well in modeling both the relations among the interacting processes and the dynamics of each process. We discuss the appropriate features under the proposed framework and evaluate DDHMMs in two applications: human motion recognition and web video categorization. The experimental results demonstrate that the double decomposition enhances video categorization performance in both cases.
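The role of a durational state variable can be illustrated with a toy sampler. This is only a sketch under stated assumptions, not the DDHMM described in the paper: the function names `sample_duration_chains` and `count_switches`, the uniform duration distribution, and the number of states are all hypothetical, and the coupling between chains is deliberately left out. Each chain keeps its state until a countdown expires, so a chain with a longer mean duration switches less often, which mimics a coarser-scale process.

```python
import random


def sample_duration_chains(n_steps, mean_durations, n_states=3, seed=0):
    """Sample toy state chains whose states persist for random durations.

    Illustrative assumption only: each chain holds its state until a
    countdown (a stand-in for a durational state variable) reaches zero,
    then moves to a different state. Chains are sampled independently,
    so the interactions modeled in the paper are not represented here.
    """
    rng = random.Random(seed)
    n_chains = len(mean_durations)
    states = [rng.randrange(n_states) for _ in range(n_chains)]
    # Remaining time steps before each chain switches state.
    # randint(1, 2*d - 1) is uniform with mean d (hypothetical choice).
    remaining = [rng.randint(1, 2 * d - 1) for d in mean_durations]
    paths = [[] for _ in range(n_chains)]
    for _ in range(n_steps):
        for c in range(n_chains):
            paths[c].append(states[c])
            remaining[c] -= 1
            if remaining[c] == 0:
                # Switch to a different state and redraw the duration.
                choices = [s for s in range(n_states) if s != states[c]]
                states[c] = rng.choice(choices)
                remaining[c] = rng.randint(1, 2 * mean_durations[c] - 1)
    return paths


def count_switches(path):
    """Count how many times consecutive states differ along a path."""
    return sum(1 for a, b in zip(path, path[1:]) if a != b)


if __name__ == "__main__":
    fine, coarse = sample_duration_chains(200, mean_durations=[2, 20])
    print("fine-scale switches:", count_switches(fine))
    print("coarse-scale switches:", count_switches(coarse))
```

In this sketch the fine-scale chain (mean duration 2) changes state far more often than the coarse-scale chain (mean duration 20), which is the qualitative behavior the abstract attributes to the durational state variable.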