SemTag and seeker: bootstrapping the semantic web via automated semantic annotation
WWW '03 Proceedings of the 12th international conference on World Wide Web
Large margin hierarchical classification
ICML '04 Proceedings of the twenty-first international conference on Machine learning
Support vector machines classification with a very large-scale taxonomy
ACM SIGKDD Explorations Newsletter - Natural language processing and text mining
Freebase: a collaboratively created graph database for structuring human knowledge
Proceedings of the 2008 ACM SIGMOD international conference on Management of data
Learning to link with wikipedia
Proceedings of the 17th ACM conference on Information and knowledge management
Collective annotation of Wikipedia entities in web text
Proceedings of the 15th ACM SIGKDD international conference on Knowledge discovery and data mining
Collective entity linking in web text: a graph-based method
Proceedings of the 34th international ACM SIGIR conference on Research and development in Information Retrieval
Robust disambiguation of named entities in text
EMNLP '11 Proceedings of the Conference on Empirical Methods in Natural Language Processing
Discriminative tag learning on YouTube videos with latent sub-tags
CVPR '11 Proceedings of the 2011 IEEE Conference on Computer Vision and Pattern Recognition
Entity Disambiguation with Freebase
WI-IAT '12 Proceedings of the The 2012 IEEE/WIC/ACM International Joint Conferences on Web Intelligence and Intelligent Agent Technology - Volume 01
Hi-index | 0.00 |
This paper presents a framework for categorizing channels of videos in a thematic taxonomy with high precision and coverage. The proposed approach consists of three main steps.First, videos are annotated by semantic entities describing their central topics. Second, semantic entities are mapped to categories using a combination of classifiers.Last, the categorization of channels is obtained by combining the results of both previous steps. This framework has been deployed on the whole corpus of YouTube, in 8 languages, and used to build several user facing products. Beyond the description of the framework, this paper gives insight into practical aspects and experience: rationale from product requirements to the choice of the solution, spam filtering, human-based evaluations of the quality of the results, and measured metrics on the live site.