Text Classification from Labeled and Unlabeled Documents using EM
Machine Learning - Special issue on information retrieval
A Trainable System for Object Detection
International Journal of Computer Vision - special issue on learning and vision at the center for biological and computational learning, Massachusetts Institute of Technology
Mean Shift: A Robust Approach Toward Feature Space Analysis
IEEE Transactions on Pattern Analysis and Machine Intelligence
Modeling the Shape of the Scene: A Holistic Representation of the Spatial Envelope
International Journal of Computer Vision
Transductive Inference for Text Classification using Support Vector Machines
ICML '99 Proceedings of the Sixteenth International Conference on Machine Learning
Content-boosted collaborative filtering for improved recommendations
Eighteenth national conference on Artificial intelligence
Efficient matching and clustering of video shots
ICIP '95 Proceedings of the 1995 International Conference on Image Processing - Volume 1
Mean Shift Based Clustering in High Dimensions: A Texture Classification Example
ICCV '03 Proceedings of the Ninth IEEE International Conference on Computer Vision - Volume 2
The CMU Pose, Illumination, and Expression Database
IEEE Transactions on Pattern Analysis and Machine Intelligence
Labeling images with a computer game
Proceedings of the SIGCHI Conference on Human Factors in Computing Systems
Recognizing Human Actions: A Local SVM Approach
ICPR '04 Proceedings of the 17th International Conference on Pattern Recognition (ICPR'04) - Volume 3
Semi-Supervised Self-Training of Object Detection Models
WACV-MOTION '05 Proceedings of the Seventh IEEE Workshops on Application of Computer Vision (WACV/MOTION'05) - Volume 1
Histograms of Oriented Gradients for Human Detection
CVPR '05 Proceedings of the 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'05) - Volume 1
Label propagation through linear neighborhoods
ICML '06 Proceedings of the 23rd international conference on Machine learning
An introduction to ROC analysis
Pattern Recognition Letters - Special issue: ROC analysis in pattern recognition
Video abstraction: A systematic review and classification
ACM Transactions on Multimedia Computing, Communications, and Applications (TOMCCAP)
The Journal of Machine Learning Research
IEEE Transactions on Pattern Analysis and Machine Intelligence
LabelMe: A Database and Web-Based Tool for Image Annotation
International Journal of Computer Vision
Get another label? improving data quality and data mining using multiple, noisy labelers
Proceedings of the 14th ACM SIGKDD international conference on Knowledge discovery and data mining
80 Million Tiny Images: A Large Data Set for Nonparametric Object and Scene Recognition
IEEE Transactions on Pattern Analysis and Machine Intelligence
Veritas: Combining Expert Opinions without Labeled Data
ICTAI '08 Proceedings of the 2008 20th IEEE International Conference on Tools with Artificial Intelligence - Volume 1
Matchin: eliciting user preferences with an online game
Proceedings of the SIGCHI Conference on Human Factors in Computing Systems
Unsupervised and semi-supervised multi-class support vector machines
AAAI'05 Proceedings of the 20th national conference on Artificial intelligence - Volume 2
Inferring semantic concepts from community-contributed images and noisy tags
MM '09 Proceedings of the 17th ACM international conference on Multimedia
The Pascal Visual Object Classes (VOC) Challenge
International Journal of Computer Vision
The Journal of Machine Learning Research
Regression Learning with Multiple Noisy Oracles
Proceedings of the 2010 conference on ECAI 2010: 19th European Conference on Artificial Intelligence
Efficient large-scale image annotation by probabilistic collaborative multi-label propagation
Proceedings of the international conference on Multimedia
Tiny Videos: A Large Data Set for Nonparametric Video Retrieval and Frame Classification
IEEE Transactions on Pattern Analysis and Machine Intelligence
Large-scale live active learning: Training object detectors with crawled data and crowds
CVPR '11 Proceedings of the 2011 IEEE Conference on Computer Vision and Pattern Recognition
A novel video key-frame-extraction algorithm based on perceived motion energy model
IEEE Transactions on Circuits and Systems for Video Technology
We address the problem of predicting category labels for unlabeled videos in a large video dataset by using a ground-truth set of objectively labeled videos that we have created. Large video databases such as YouTube require that a user uploading a new video assign to it a category label from a prescribed set of labels. Such category labeling is likely to be corrupted by the subjective biases of the uploader. Despite their noisy nature, these subjective labels are frequently used as the gold standard in algorithms for multimedia classification and retrieval. Our goal in this paper is NOT to propose yet another algorithm that predicts labels for unseen videos based on the subjective ground truth. Rather, our goal is to demonstrate that video classification performance can be improved if, instead of using subjective labels, we first create an objectively labeled ground-truth set of videos and then train a classifier on that set to predict objective labels for the unlabeled videos. To generate the objectively labeled ground-truth dataset, we rely on the notion that when a video is labeled by a panel of diverse individuals, the majority opinion of the panel may be taken to be the objective opinion. In this manner, using judgments provided by multiple human annotators, we have collected objective labels for a ground-truth dataset of 1000 randomly selected videos from the TinyVideos database, which contains roughly 52,000 videos from YouTube (courtesy of Karpenko and Aarabi [1]). Through a fourfold cross-validation experiment on the ground-truth set, we demonstrate that the objective labels are more consistent than the subjective labels when used for video classification. We show that this claim holds for several different kinds of feature sets that one can use to compare videos and for two different types of classifiers that one can use for label prediction.
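The majority-opinion rule described above can be illustrated with a minimal sketch. The video identifiers, category names, and the choice to return `None` on a tie are assumptions made for illustration only; the abstract does not say how ties are handled or how many annotators label each video.

```python
from collections import Counter


def majority_label(votes):
    """Return the label chosen by the most annotators, or None on a tie.

    Tie handling is an assumption of this sketch, not taken from the paper.
    """
    ranked = Counter(votes).most_common()
    if not ranked:
        return None
    if len(ranked) > 1 and ranked[0][1] == ranked[1][1]:
        return None  # no single majority opinion
    return ranked[0][0]


# Hypothetical annotator judgments for three videos.
annotations = {
    "video_a": ["Music", "Music", "Entertainment"],
    "video_b": ["Sports", "News", "Sports", "Sports"],
    "video_c": ["Comedy", "Music"],
}

# The "objective" label of each video is the panel's majority opinion.
objective = {vid: majority_label(votes) for vid, votes in annotations.items()}
print(objective)
```

Videos for which the panel reaches no majority would need a separate policy, such as collecting more judgments; that policy is outside what the abstract specifies.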
Subsequently, we use the ground-truth dataset of 1000 videos to predict the objective category labels of the remaining 51,000 videos. We compare the objective labels thus determined with the subjective labels provided by the video uploaders and argue qualitatively that the objective labels are more informative.
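One simple way to begin comparing predicted objective labels with uploader-provided subjective labels is to tabulate how often they agree and which category pairs disagree most. The sketch below is only an illustration of such a tabulation; the function name `compare_labels`, the video identifiers, and the categories are hypothetical, and the paper's own comparison is qualitative.

```python
from collections import Counter


def compare_labels(objective, subjective):
    """Return (agreement rate, Counter of (subjective, objective) mismatches).

    Only videos present in both mappings are compared.
    """
    shared = [vid for vid in objective if vid in subjective]
    agree = sum(1 for vid in shared if objective[vid] == subjective[vid])
    mismatches = Counter(
        (subjective[vid], objective[vid])
        for vid in shared
        if objective[vid] != subjective[vid]
    )
    rate = agree / len(shared) if shared else 0.0
    return rate, mismatches


# Hypothetical labels for four videos.
objective_labels = {"v1": "Music", "v2": "Sports", "v3": "Comedy", "v4": "News"}
subjective_labels = {"v1": "Music", "v2": "Entertainment", "v3": "Comedy", "v4": "Entertainment"}

rate, mismatches = compare_labels(objective_labels, subjective_labels)
print(rate)        # fraction of videos where both labels coincide
print(mismatches)  # which subjective categories were relabeled, and to what
```

A mismatch table of this kind shows, for example, whether a broad uploader category is being split into more specific objective categories.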