Concept detection aims to automatically label video content with the semantic concepts appearing in it, such as objects, locations, or activities. While concept detectors have become key components in many research prototypes for content-based video retrieval, their practical use is limited by the need for large-scale annotated training sets. To overcome this problem, we propose to train concept detectors on material downloaded from web-based video sharing portals such as YouTube: training is based on the tags users assign during upload, no manual annotation is required, and concept detection can scale to thousands of concepts. On the downside, web video is a complex domain, and the tags associated with it are weak and unreliable. Consequently, a performance loss is to be expected when replacing high-quality state-of-the-art training sets with web video content. This paper presents a concept detection prototype named TubeTagger that uses YouTube content for autonomous training. In quantitative experiments, we compare the performance of training on web video with training on standard datasets from the literature. We demonstrate that concept detection in web video is feasible and that, when testing on YouTube videos, the YouTube-based detector outperforms those trained on standard training sets. By applying the YouTube-based prototype to datasets from the literature, we further demonstrate that: (1) if training annotations on the target domain are available, the resulting detectors significantly outperform the YouTube-based tagger; (2) if no annotations are available, the YouTube-based detector achieves performance comparable to detectors trained on standard datasets (a moderate relative performance loss of 11.4% is measured) while offering fully automatic, scalable learning; and (3) by enriching conventional training sets with online video material, performance improvements of 11.7% can be achieved when generalizing to domains unseen in training.
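The core idea of tag-based training can be sketched in a few lines: a video counts as a positive example for a concept if its user-assigned tags contain that concept, and as a (noisy) negative example otherwise, so no manual annotation is needed. The sketch below is illustrative only: the toy feature vectors, the `videos` structure, and the plain logistic-regression learner (standing in for the SVM-based detectors typically used in this line of work) are all assumptions, not the paper's actual pipeline.

```python
import math

def weak_labels(videos, concept):
    # Derive weak labels from user tags: positive if the concept tag is
    # present, negative otherwise (negatives are noisy by construction).
    return [(v["features"], 1 if concept in v["tags"] else 0) for v in videos]

def train_logreg(data, epochs=200, lr=0.1):
    # Plain logistic regression via SGD -- a stand-in for the SVM-based
    # per-concept detectors; real systems would use richer visual features.
    dim = len(data[0][0])
    w, b = [0.0] * dim, 0.0
    for _ in range(epochs):
        for x, y in data:
            z = sum(wi * xi for wi, xi in zip(w, x)) + b
            p = 1.0 / (1.0 + math.exp(-z))   # sigmoid prediction
            g = p - y                        # gradient of log loss w.r.t. z
            w = [wi - lr * g * xi for wi, xi in zip(w, x)]
            b -= lr * g
    return w, b

def score(w, b, x):
    # Detector confidence that the concept is present in a video.
    z = sum(wi * xi for wi, xi in zip(w, x)) + b
    return 1.0 / (1.0 + math.exp(-z))

# Hypothetical tagged web videos with 2-d toy feature vectors.
videos = [
    {"features": [1.0, 0.0], "tags": {"beach", "sea"}},
    {"features": [0.9, 0.1], "tags": {"beach"}},
    {"features": [0.0, 1.0], "tags": {"city"}},
    {"features": [0.1, 0.9], "tags": {"street"}},
]
w, b = train_logreg(weak_labels(videos, "beach"))
```

Because one such binary detector is trained per concept directly from tags, the scheme scales to thousands of concepts at the cost of label noise, which is the trade-off the experiments above quantify.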