Category hierarchies often evolve much more slowly than the documents that reside in them. As new documents are continually added to a hierarchy, new topics emerge and the documents within a category become less topically cohesive. In this paper, we propose a novel automatic approach that modifies a given category hierarchy by redistributing its documents into more topically cohesive categories. The modification is carried out with three operations (sprout, merge, and assign) and draws on an auxiliary hierarchy for additional semantic information; the auxiliary hierarchy covers a set of topics similar to that of the hierarchy being modified. Our user study shows that the modified category hierarchy is semantically meaningful. As an extrinsic evaluation, we conduct document classification experiments on real data from the Yahoo! Answers and AnswerBag hierarchies, and compare the classification accuracies obtained on the original and the modified hierarchies. The experiments show that the proposed method yields a much larger improvement in classification accuracy than several baseline methods for hierarchy modification.