Deriving concept hierarchies from text
Proceedings of the 22nd annual international ACM SIGIR conference on Research and development in information retrieval
Hierarchical classification of Web content
SIGIR '00 Proceedings of the 23rd annual international ACM SIGIR conference on Research and development in information retrieval
Constrained K-means Clustering with Background Knowledge
ICML '01 Proceedings of the Eighteenth International Conference on Machine Learning
Generating hierarchical summaries for web searches
Proceedings of the 26th annual international ACM SIGIR conference on Research and development in informaion retrieval
Liveclassifier: creating hierarchical text classifiers through web corpora
Proceedings of the 13th international conference on World Wide Web
Integrating constraints and metric learning in semi-supervised clustering
ICML '04 Proceedings of the twenty-first international conference on Machine learning
Hierarchical Clustering Algorithms for Document Datasets
Data Mining and Knowledge Discovery
A personalized search engine based on web-snippet hierarchical clustering
WWW '05 Special interest tracks and posters of the 14th international conference on World Wide Web
Retrieval models for question and answer archives
Proceedings of the 31st annual international ACM SIGIR conference on Research and development in information retrieval
Deep classification in large-scale text hierarchies
Proceedings of the 31st annual international ACM SIGIR conference on Research and development in information retrieval
A class-feature-centroid classifier for text categorization
Proceedings of the 18th international conference on World wide web
User generated content: how good is it?
Proceedings of the 3rd workshop on Information credibility on the web
Enhancing cluster labeling using wikipedia
Proceedings of the 32nd international ACM SIGIR conference on Research and development in information retrieval
A syntactic tree matching approach to finding similar questions in community-based qa services
Proceedings of the 32nd international ACM SIGIR conference on Research and development in information retrieval
Exploiting internal and external semantics for the clustering of short texts using world knowledge
Proceedings of the 18th ACM conference on Information and knowledge management
A metric-based framework for automatic taxonomy induction
ACL '09 Proceedings of the Joint Conference of the 47th Annual Meeting of the ACL and the 4th International Joint Conference on Natural Language Processing of the AFNLP: Volume 1 - Volume 1
Detecting health events on the social web to enable epidemic intelligence
SPIRE'11 Proceedings of the 18th international conference on String processing and information retrieval
Large-scale question classification in cQA by leveraging Wikipedia semantic knowledge
Proceedings of the 20th ACM international conference on Information and knowledge management
Beyond precision@10: clustering the long tail of web search results
Proceedings of the 20th ACM international conference on Information and knowledge management
Cluster labeling for multilingual scatter/gather using comparable corpora
ECIR'12 Proceedings of the 34th European conference on Advances in Information Retrieval
SSHLDA: a semi-supervised hierarchical topic model
EMNLP-CoNLL '12 Proceedings of the 2012 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning
Exploring the existing category hierarchy to automatically label the newly-arising topics in cQA
Proceedings of the 21st ACM international conference on Information and knowledge management
Topic hierarchy construction for the organization of multi-source user generated contents
Proceedings of the 36th international ACM SIGIR conference on Research and development in information retrieval
Information Sciences: an International Journal
Hi-index | 0.00 |
This paper presents a novel prototype hierarchy based clustering (PHC) framework for the organization of web collections. It solves simultaneously the problem of categorizing web collections and interpreting the clustering results for navigation. By utilizing prototype hierarchies and the underlying topic structures of the collections, PHC is modeled as a multi-criterion optimization problem based on minimizing the hierarchy evolution, maximizing category cohesiveness and inter-hierarchy structural and semantic resemblance. The flexible design of metrics enables PHC to be a general framework for applications in various domains. In the experiments on categorizing 4 collections of distinct domains, PHC achieves 30% improvement in ¼F1 over the state-of-the-art techniques. Further experiments provide insights on performance variations with abstract and concrete domains, completeness of the prototype hierarchy, and effects of different combinations of optimization criteria.