Semi-Supervised Cross Feature Learning for Semantic Concept Detection in Videos

Authors:
Rong Yan;Milind Naphade
Affiliations:
Carnegie Mellon University;IBM TJ Watson Research Center
Venue:
CVPR '05 Proceedings of the 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'05) - Volume 1
Year:
2005

Citing 0
Cited 34

Multiple instance learning for labeling faces in broadcasting news video

Proceedings of the 13th annual ACM international conference on Multimedia
To construct optimal training set for video annotation

MULTIMEDIA '06 Proceedings of the 14th annual ACM international conference on Multimedia
Automatic video annotation by semi-supervised learning with kernel density estimation

MULTIMEDIA '06 Proceedings of the 14th annual ACM international conference on Multimedia
Co-Adaptation of audio-visual speech and gesture classifiers

Proceedings of the 8th international conference on Multimodal interfaces
Semi-supervised learning for semantic video retrieval

Proceedings of the 6th ACM international conference on Image and video retrieval
Adapting appearance models of semantic concepts to particular videos via transductive learning

Proceedings of the international workshop on multimedia information retrieval
TV ad video categorization with probabilistic latent concept learning

Proceedings of the international workshop on multimedia information retrieval
Structure-sensitive manifold ranking for video concept detection

Proceedings of the 15th international conference on Multimedia
Optimizing multi-graph learning: towards a unified video annotation scheme

Proceedings of the 15th international conference on Multimedia
Optimizing training set construction for video semantic classification

EURASIP Journal on Advances in Signal Processing
Exploring knowledge of sub-domain in a multi-resolution bootstrapping framework for concept detection in news video

MM '08 Proceedings of the 16th ACM international conference on Multimedia
Transductive multi-label learning for video concept detection

MIR '08 Proceedings of the 1st ACM international conference on Multimedia information retrieval
Semi-supervised kernel density estimation for video annotation

Computer Vision and Image Understanding
Detecting Violent Scenes in Movies by Auditory and Visual Cues

PCM '08 Proceedings of the 9th Pacific Rim Conference on Multimedia: Advances in Multimedia Information Processing
Video semantic analysis based on structure-sensitive anisotropic manifold ranking

Signal Processing
Unified video annotation via multigraph learning

IEEE Transactions on Circuits and Systems for Video Technology
Beyond distance measurement: constructing neighborhood similarity for video annotation

IEEE Transactions on Multimedia - Special section on communities and media computing
Tensor-based transductive learning for multimodality video semantic concept detection

IEEE Transactions on Multimedia
Robust semantic concept detection in large video collections

SMC'09 Proceedings of the 2009 IEEE international conference on Systems, Man and Cybernetics
Kernel-based linear neighborhood propagation for semantic video annotation

PAKDD'07 Proceedings of the 11th Pacific-Asia conference on Advances in knowledge discovery and data mining
MAPACo-training: a novel online learning algorithm of behavior models

ACCV'07 Proceedings of the 8th Asian conference on Computer vision - Volume Part I
MILC2: a multi-layer multi-instance learning approach to video concept detection

MMM'08 Proceedings of the 14th international conference on Advances in multimedia modeling
On the sampling of web images for learning visual concept classifiers

Proceedings of the ACM International Conference on Image and Video Retrieval
Thematic video thumbnail selection

ICIP'09 Proceedings of the 16th IEEE international conference on Image processing
Improving video concept detection using spatio-temporal correlation

PCM'10 Proceedings of the 11th Pacific Rim conference on Advances in multimedia information processing: Part I
Learning to recognize objects from unseen modalities

ECCV'10 Proceedings of the 11th European conference on Computer vision: Part I
A transductive multi-label learning approach for video concept detection

Pattern Recognition
Collaborative track analysis, data cleansing, and labeling

ISVC'11 Proceedings of the 7th international conference on Advances in visual computing - Volume Part I
Semi-supervised multi-instance multi-label learning for video annotation task

Proceedings of the 20th ACM international conference on Multimedia
Web page and image semi-supervised classification with heterogeneous information fusion

Journal of Information Science
Violence detection in hollywood movies by the fusion of visual and mid-level audio cues

Proceedings of the 21st ACM international conference on Multimedia
Multi-view semi-supervised web image classification via co-graph

Neurocomputing
Kernel-based transition probability toward similarity measure for semi-supervised learning

Pattern Recognition
Selection of negative samples and two-stage combination of multiple features for action detection in thousands of videos

Machine Vision and Applications

Abstract

For large-scale automatic semantic video characterization, it is necessary to learn and model a large number of semantic concepts. A major obstacle is the insufficiency of labeled training samples. Multi-view semi-supervised learning algorithms such as co-training may help by incorporating a large amount of unlabeled data. However, one of their assumptions, that each view be sufficient for learning, is usually violated in semantic concept detection. In this paper, we propose a novel multi-view semi-supervised learning algorithm called semi-supervised cross feature learning (SCFL). The proposed algorithm has two advantages over co-training. First, SCFL can theoretically guarantee that its performance is not significantly degraded even when the assumption of view sufficiency fails. Second, SCFL can handle additional views of unlabeled data even when these views are absent from the training data. As demonstrated in the TRECVID'03 semantic concept extraction task, the proposed SCFL algorithm not only significantly outperforms conventional co-training algorithms, but also comes close to the performance that would be achieved if the unlabeled set were manually annotated and used for training along with the labeled data set.
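
For readers unfamiliar with the co-training baseline mentioned in the abstract, the following is a minimal sketch of generic two-view co-training, not of SCFL itself, whose details the abstract does not give. It assumes binary labels, a toy nearest-centroid classifier per view, and small hand-made data; the names `NearestCentroid`, `co_train`, `view_a`, and `view_b` are hypothetical and chosen only for illustration.

```python
def centroid(rows):
    """Component-wise mean of a non-empty list of equal-length vectors."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]


def sq_dist(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


class NearestCentroid:
    """Toy binary classifier for a single view (labels 0 and 1)."""

    def fit(self, X, y):
        # Requires at least one labeled example of each class.
        self.c = {k: centroid([x for x, t in zip(X, y) if t == k]) for k in (0, 1)}
        return self

    def score(self, x):
        # Positive when class 1 is closer; the magnitude serves as confidence.
        return sq_dist(x, self.c[0]) - sq_dist(x, self.c[1])

    def predict(self, x):
        return 1 if self.score(x) > 0 else 0


def co_train(view_a, view_b, labels, rounds=10, per_round=2):
    """labels[i] is 0, 1, or None (unlabeled).

    In each round, each view's classifier pseudo-labels the unlabeled
    examples it is most confident about and adds them to the shared pool,
    so the other view can learn from them in its next turn.
    """
    labels = list(labels)
    views = (view_a, view_b)
    for _ in range(rounds):
        for view in views:
            lab = [i for i, t in enumerate(labels) if t is not None]
            unl = [i for i, t in enumerate(labels) if t is None]
            if not unl:
                break
            clf = NearestCentroid().fit([view[i] for i in lab],
                                        [labels[i] for i in lab])
            unl.sort(key=lambda i: abs(clf.score(view[i])), reverse=True)
            for i in unl[:per_round]:
                labels[i] = clf.predict(view[i])
    lab = [i for i, t in enumerate(labels) if t is not None]
    clfs = [NearestCentroid().fit([v[i] for i in lab], [labels[i] for i in lab])
            for v in views]
    return clfs, labels


# Hand-made, well-separated toy data: 10 examples per class, two views each.
view_a = [[0.1 * i, 0.0] for i in range(10)] + [[5.0 + 0.1 * i, 5.0] for i in range(10)]
view_b = [[-0.1 * i, 1.0] for i in range(10)] + [[6.0 - 0.1 * i, -3.0] for i in range(10)]
truth = [0] * 10 + [1] * 10
# Only one labeled example per class; everything else is unlabeled.
initial = [0] + [None] * 9 + [1] + [None] * 9

clfs, filled = co_train(view_a, view_b, initial)
```

In this toy setting each view alone separates the classes, which is exactly the view-sufficiency assumption the abstract says is usually violated in semantic concept detection; when a view is not sufficient, confident but wrong pseudo-labels can enter the shared pool and degrade the other view's classifier.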