Information resources on the Web, such as videos, images, and documents, are becoming increasingly "social" through user engagement via commenting systems. These commenting systems provide a forum for users to discuss the resources and, as a side effect, supply valuable editorial and contextual information about them. In this paper, we explore a comments-driven clustering framework for organizing Web resources according to this user-based perspective. Concretely, we propose a hierarchical comment clustering approach that relies on two key features: (i) comment term normalization and key term extraction, which distill noisy comments for effective clustering; and (ii) a real-time insertion component that incrementally updates the comments-based hierarchy, so that resources can be placed efficiently as comments arrive, without regenerating the (potentially expensive) hierarchy. We study the clustering approach on the popular video sharing site YouTube, a challenging environment notorious for its extremely short, ill-formed, and often unintelligible user-contributed comments. Through an extensive experimental study, we find that the proposed approach can lead to effective and efficient comments-based video organization even in a YouTube-like environment.
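To make the two features concrete, the following is a minimal sketch and not the method evaluated in the paper, whose details the abstract does not give. It assumes a crude normalization (lowercasing, squeezing runs of three or more repeated letters, dropping a toy stopword list), frequency-based key terms, and a cosine-similarity threshold for deciding where to place a new resource. All names (`normalize`, `key_terms`, `Node`, `insert`) and the threshold value are hypothetical.

```python
import math
import re
from collections import Counter

# Toy stopword list; an assumption for illustration only.
STOPWORDS = {"the", "a", "an", "is", "this", "so", "and", "i", "it", "to", "of"}

def normalize(comment):
    """Lowercase, squeeze runs of 3+ repeated letters, drop stopwords and short tokens."""
    text = re.sub(r"([a-z])\1{2,}", r"\1", comment.lower())  # "looove" -> "love"
    return [t for t in re.findall(r"[a-z]+", text)
            if t not in STOPWORDS and len(t) >= 3]

def key_terms(comments, k=5):
    """Represent a resource by the k most frequent normalized terms in its comments."""
    counts = Counter(t for c in comments for t in normalize(c))
    return Counter(dict(counts.most_common(k)))

def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

class Node:
    """A cluster in the hierarchy; leaves hold resource ids."""
    def __init__(self):
        self.centroid = Counter()  # summed term counts of everything below this node
        self.children = []
        self.items = []

def insert(root, item_id, vector, threshold=0.3):
    """Place one resource by descending toward the most similar child,
    without rebuilding the rest of the hierarchy."""
    node = root
    while True:
        node.centroid.update(vector)
        best = max(node.children, key=lambda c: cosine(c.centroid, vector), default=None)
        if best is None or cosine(best.centroid, vector) < threshold:
            leaf = Node()  # nothing similar enough: open a new cluster here
            leaf.centroid.update(vector)
            leaf.items.append(item_id)
            node.children.append(leaf)
            return leaf
        if not best.children:  # similar leaf cluster: join it
            best.centroid.update(vector)
            best.items.append(item_id)
            return best
        node = best  # similar internal node: keep descending

root = Node()
insert(root, "vid1", key_terms(["I looove this guitar solo!!!", "best guitar solo ever"]))
insert(root, "vid2", key_terms(["amazing guitar solo"]))
insert(root, "vid3", key_terms(["cute cat falls off table", "funny cat"]))
```

In this toy run the two guitar videos share a cluster and the cat video opens a new one. Each insertion touches only the nodes on one root-to-leaf path, which is the property that makes incremental placement cheaper than regenerating the hierarchy.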