Correlative multilabel video annotation with temporal kernels

  • Authors:
  • Guo-Jun Qi; Xian-Sheng Hua; Yong Rui; Jinhui Tang; Tao Mei; Meng Wang; Hong-Jiang Zhang

  • Affiliations:
  • University of Science and Technology of China, Anhui, China; Microsoft Corporation, Beijing, China; Microsoft Corporation, Beijing, China; University of Science and Technology of China, Anhui, China; Microsoft Corporation, Beijing, China; University of Science and Technology of China, Anhui, China; Microsoft Corporation, Beijing, China

  • Venue:
  • ACM Transactions on Multimedia Computing, Communications, and Applications (TOMCCAP)
  • Year:
  • 2008

Abstract

Automatic video annotation is an important ingredient for semantic-level video browsing, search and navigation. This topic has attracted much attention in recent years, and the research has evolved through two paradigms. In the first paradigm, each concept is annotated individually by a pre-trained binary classifier. However, this approach ignores the rich correlations among video concepts and achieves only limited success. Methods in the second paradigm, which evolved from the first, add an extra step on top of the individual classifiers to fuse their concept detections. However, their performance can be degraded by errors that propagate from the first detection step to the second fusion step. In this article, a third paradigm is proposed to address these problems: the proposed Correlative Multilabel (CML) method simultaneously annotates the concepts and models the correlations between them in a single step, benefiting from the complementary information shared among different labels. Furthermore, since video clips are temporally ordered frame sequences, we extend the proposed method to exploit the rich temporal information in videos. Specifically, a temporal kernel, built on the discriminative information between Hidden Markov Models (HMMs) learned from the videos, is incorporated into the CML method. We compare the proposed approach against state-of-the-art approaches from the first and second paradigms on the widely used TRECVID data set, and the results show that the proposed method achieves superior performance.
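
The abstract does not give the exact form of the temporal kernel or of the joint CML model. The sketch below is only an illustrative assumption of how an HMM-based temporal kernel between clips could be computed and fed to a kernel classifier: a per-clip HMM is fitted to the frame features, a symmetric similarity is formed from averaged cross log-likelihoods, and a precomputed-kernel SVM per concept stands in for the joint multilabel model. All function names and parameters here are hypothetical choices, not the authors' formulation.

```python
# Minimal sketch (assumption): a likelihood-based temporal kernel between
# per-clip HMMs, used with per-concept kernel SVMs. This is NOT the paper's
# exact CML formulation, only an illustration of the kind of kernel described.
import numpy as np
from hmmlearn.hmm import GaussianHMM
from sklearn.svm import SVC

def fit_clip_hmm(frames, n_states=3, seed=0):
    """Fit an HMM to one clip's frame features, shape (n_frames, n_dims)."""
    hmm = GaussianHMM(n_components=n_states, covariance_type="diag",
                      n_iter=50, random_state=seed)
    hmm.fit(frames)
    return hmm

def temporal_kernel(clips):
    """Symmetric similarity from averaged cross log-likelihoods between clips."""
    models = [fit_clip_hmm(c) for c in clips]
    n = len(clips)
    K = np.zeros((n, n))
    for i in range(n):
        for j in range(n):
            # score() returns the log-likelihood of a sequence under a model;
            # dividing by clip length keeps clips of different lengths comparable.
            K[i, j] = 0.5 * (models[i].score(clips[j]) / len(clips[j])
                             + models[j].score(clips[i]) / len(clips[i]))
    return np.exp(K - K.max())  # map log-likelihoods to positive similarities

# Toy usage: a few synthetic clips and two example concept labels.
rng = np.random.default_rng(0)
clips = [rng.normal(size=(20, 4)) + k for k in (0, 0, 1, 1)]
K = temporal_kernel(clips)
labels = {"outdoor": [1, 1, 0, 0], "face": [0, 1, 0, 1]}
# One precomputed-kernel SVM per concept; the actual CML method instead trains
# a single joint model over all concepts and their correlations.
detectors = {c: SVC(kernel="precomputed").fit(K, y) for c, y in labels.items()}
```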