Concept detection aims to automatically label video content with the semantic concepts appearing in it, such as objects, locations, or activities. While concept detectors have become key components in many research prototypes for content-based video retrieval, their practical use is limited by the need for large-scale annotated training sets. To overcome this problem, we propose to train concept detectors on material downloaded from web-based video sharing portals such as YouTube: training is based on the tags users assign during upload, no manual annotation is required, and concept detection can scale to thousands of concepts. On the downside, web video is a complex domain, and the tags associated with it are weak and unreliable. Consequently, a performance loss is to be expected when replacing high-quality state-of-the-art training sets with web video content. This paper presents a concept detection prototype named TubeTagger that uses YouTube content for autonomous training. In quantitative experiments, we compare the performance of training on web video with training on standard datasets from the literature. We demonstrate that concept detection in web video is feasible and that, when testing on YouTube videos, the YouTube-based detector outperforms those trained on standard training sets. By applying the YouTube-based prototype to datasets from the literature, we further demonstrate that: (1) if training annotations on the target domain are available, the resulting detectors significantly outperform the YouTube-based tagger; (2) if no annotations are available, the YouTube-based detector achieves performance comparable to detectors trained on standard datasets (a moderate relative performance loss of 11.4% is measured) while offering fully automatic, scalable learning; and (3) by enriching conventional training sets with online video material, performance improvements of 11.7% can be achieved when generalizing to domains unseen in training.
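The core idea of tag-based training can be sketched in a few lines: a video counts as a positive example for a concept if its user-assigned tags contain that concept, and as a (noisy) negative example otherwise, so no manual annotation is needed. The sketch below is illustrative only: the toy feature vectors, the `videos` structure, and the plain logistic-regression learner (standing in for the SVM-based detectors typically used in this line of work) are all assumptions, not the paper's actual pipeline.

```python
import math

def weak_labels(videos, concept):
    # Derive weak labels from user tags: positive if the concept tag is
    # present, negative otherwise (negatives are noisy by construction).
    return [(v["features"], 1 if concept in v["tags"] else 0) for v in videos]

def train_logreg(data, epochs=200, lr=0.1):
    # Plain logistic regression via SGD -- a stand-in for the SVM-based
    # per-concept detectors; real systems would use richer visual features.
    dim = len(data[0][0])
    w, b = [0.0] * dim, 0.0
    for _ in range(epochs):
        for x, y in data:
            z = sum(wi * xi for wi, xi in zip(w, x)) + b
            p = 1.0 / (1.0 + math.exp(-z))   # sigmoid prediction
            g = p - y                        # gradient of log loss w.r.t. z
            w = [wi - lr * g * xi for wi, xi in zip(w, x)]
            b -= lr * g
    return w, b

def score(w, b, x):
    # Detector confidence that the concept is present in a video.
    z = sum(wi * xi for wi, xi in zip(w, x)) + b
    return 1.0 / (1.0 + math.exp(-z))

# Hypothetical tagged web videos with 2-d toy feature vectors.
videos = [
    {"features": [1.0, 0.0], "tags": {"beach", "sea"}},
    {"features": [0.9, 0.1], "tags": {"beach"}},
    {"features": [0.0, 1.0], "tags": {"city"}},
    {"features": [0.1, 0.9], "tags": {"street"}},
]
w, b = train_logreg(weak_labels(videos, "beach"))
```

Because one such binary detector is trained per concept directly from tags, the scheme scales to thousands of concepts at the cost of label noise, which is the trade-off the experiments above quantify.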