Documents are commonly categorized into hierarchies of topics, such as the ones maintained by Yahoo! and the Open Directory Project, in order to facilitate browsing and other interactive forms of information retrieval. In addition, topic hierarchies can be used to overcome the sparseness problem in text categorization with a large number of categories, which is the main focus of this paper. This paper presents a hierarchical mixture model which extends the standard naive Bayes classifier and previous hierarchical approaches. Estimates of the term distributions are improved by differentiating words in the hierarchy according to their level of generality or specificity. Experiments on the Newsgroups and Reuters-21578 datasets indicate that, on datasets with a small number of positive examples, the proposed classifier performs better than other state-of-the-art methods.
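The general idea of sharing term statistics along a topic hierarchy can be illustrated with a small sketch. It is not the model proposed in the paper, whose exact form and estimation procedure are not given in this abstract. It assumes a toy hierarchy (`PATHS`), a handful of made-up training documents (`DOCS`), and fixed per-level mixing weights (`WEIGHTS`), all hypothetical; a real system would learn such weights from data. Each leaf class gets a term distribution that is a weighted mixture of the distributions estimated at the nodes on its root-to-leaf path, so general words are supported by the well-populated upper nodes and specific words by the leaf.

```python
import math
from collections import Counter

# Hypothetical toy hierarchy: each leaf class maps to its root-to-leaf path.
PATHS = {
    "hockey": ["root", "sport", "hockey"],
    "baseball": ["root", "sport", "baseball"],
    "graphics": ["root", "computing", "graphics"],
}

# Made-up training documents: (tokens, leaf label).
DOCS = [
    (["puck", "ice", "game", "team"], "hockey"),
    (["bat", "pitch", "game", "team"], "baseball"),
    (["pixel", "render", "image", "software"], "graphics"),
]

# Assumed fixed mixing weights per level (root, inner node, leaf); they sum to 1.
WEIGHTS = (0.2, 0.3, 0.5)


def node_term_distributions(docs, paths, vocab, alpha=1.0):
    """Laplace-smoothed term distribution for every node.

    A node pools the tokens of all documents whose leaf lies below it.
    """
    counts = {}
    for tokens, label in docs:
        for node in paths[label]:
            counts.setdefault(node, Counter()).update(tokens)
    dists = {}
    for node, c in counts.items():
        total = sum(c.values()) + alpha * len(vocab)
        dists[node] = {w: (c[w] + alpha) / total for w in vocab}
    return dists


def class_term_distribution(label, paths, dists, weights, vocab):
    """Mix the node distributions along the path of one leaf class."""
    path = paths[label]
    return {
        w: sum(lam * dists[node][w] for lam, node in zip(weights, path))
        for w in vocab
    }


def classify(tokens, docs, paths, weights):
    """Naive Bayes decision rule using the mixed term distributions."""
    vocab = sorted({w for toks, _ in docs for w in toks})
    dists = node_term_distributions(docs, paths, vocab)
    label_counts = Counter(label for _, label in docs)
    best_label, best_score = None, -math.inf
    for label in paths:
        p_w = class_term_distribution(label, paths, dists, weights, vocab)
        score = math.log(label_counts[label] / len(docs))  # class prior
        # Words outside the training vocabulary are ignored.
        score += sum(math.log(p_w[w]) for w in tokens if w in p_w)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


print(classify(["puck", "team"], DOCS, PATHS, WEIGHTS))
print(classify(["pixel", "image"], DOCS, PATHS, WEIGHTS))
```

With a single training document per class, the leaf-level counts alone are very sparse; the mixture lets the `sport` and `root` nodes contribute evidence for shared words such as `game` and `team`, which is the kind of sparseness relief the abstract describes.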