Automatic Concept Detector Refinement for Large-Scale Video Semantic Annotation
ICSC '10 Proceedings of the 2010 IEEE Fourth International Conference on Semantic Computing
The explosion of social video sharing sites poses new challenges for video search and indexing techniques. Because of the diversity of concepts in social videos, it is very hard to build a well-annotated dataset that covers the full semantic range of those concepts. However, the abundance of social videos on the internet also makes it easy to obtain a huge number of videos, which offers an opportunity to mine semantic content from a virtually unlimited pool of video entities. In this paper, we focus on improving the performance of concept detectors and propose a refinement framework based on a semi-supervised learning technique. In our framework, the self-training algorithm is employed to expand the training dataset with automatically labeled data. The contribution of this paper is to demonstrate how to utilize visual features and text metadata to enhance the performance of a concept classifier with a large number of unlabeled videos. Experiments on a social video dataset with 21,000 entities show that, after expanding the training set with automatically labeled shots, the performance of the concept detectors is significantly improved.
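The self-training loop described above can be sketched as follows. This is a minimal, hedged illustration only: the paper trains SVM concept detectors (via LIBSVM), but to keep the example self-contained this sketch substitutes a simple nearest-centroid classifier, and the data, the `margin` confidence threshold, and the function names are all hypothetical stand-ins for the paper's actual features and parameters.

```python
# Self-training sketch: train on labeled shots, pseudo-label the
# high-confidence unlabeled shots, add them to the training set, retrain.
# Nearest-centroid classifier stands in for the paper's SVM detectors.

def centroid(points):
    """Component-wise mean of a list of equal-length feature vectors."""
    n = len(points)
    return [sum(p[i] for p in points) / n for i in range(len(points[0]))]

def dist2(a, b):
    """Squared Euclidean distance."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def self_train(labeled, unlabeled, rounds=3, margin=0.5):
    """labeled: list of (features, label in {0, 1}); unlabeled: list of
    feature vectors. Each round, pseudo-label the unlabeled points whose
    distance gap between the two class centroids exceeds `margin`
    (a hypothetical confidence criterion), merge them into the training
    set, and retrain. Returns the final classifier as a function."""
    train, pool = list(labeled), list(unlabeled)
    for _ in range(rounds):
        c0 = centroid([x for x, y in train if y == 0])
        c1 = centroid([x for x, y in train if y == 1])
        confident, rest = [], []
        for x in pool:
            d0, d1 = dist2(x, c0), dist2(x, c1)
            if abs(d0 - d1) > margin:   # high-confidence pseudo-label
                confident.append((x, 0 if d0 < d1 else 1))
            else:
                rest.append(x)
        if not confident:               # nothing confident left to add
            break
        train += confident
        pool = rest
    c0 = centroid([x for x, y in train if y == 0])
    c1 = centroid([x for x, y in train if y == 1])
    return lambda x: 0 if dist2(x, c0) < dist2(x, c1) else 1

# Toy usage: two 1-D clusters; only one labeled example per class.
labeled = [([0.0], 0), ([1.0], 1)]
unlabeled = [[0.1], [0.2], [0.8], [0.9]]
clf = self_train(labeled, unlabeled)
```

In the paper's setting, the confidence criterion would come from the SVM decision values computed over the visual features and text metadata of each shot, rather than from centroid distances.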