Text classification in a hierarchical mixture model for small training sets

Authors:
Kristina Toutanova; Francine Chen; Kris Popat; Thomas Hofmann
Affiliations:
Stanford University, Stanford, CA; Xerox PARC, Palo Alto, CA; Xerox PARC, Palo Alto, CA; Brown University, Providence, RI
Venue:
Proceedings of the tenth international conference on Information and knowledge management
Year:
2001

Abstract

Documents are commonly categorized into hierarchies of topics, such as those maintained by Yahoo! and the Open Directory Project, to facilitate browsing and other interactive forms of information retrieval. Topic hierarchies can also be used to overcome the sparseness problem in text categorization with a large number of categories, which is the main focus of this paper. This paper presents a hierarchical mixture model that extends the standard naive Bayes classifier and previous hierarchical approaches. Estimates of the term distributions are improved by differentiating words in the hierarchy according to their level of generality or specificity. Experiments on the Newsgroups and Reuters-21578 datasets indicate improved performance of the proposed classifier compared with other state-of-the-art methods on datasets with a small number of positive examples.
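
The idea of sharing term statistics along a topic hierarchy can be illustrated with a minimal Python sketch. It is not the model from the paper: it is a simpler shrinkage-style variant in which each leaf class's word distribution is a fixed-weight mixture of the distributions at the leaf, its parent, and the root. The hierarchy in `PARENT`, the mixing `weights`, and all function names are assumptions made for illustration. The abstract does not specify how the mixture is estimated, so the weights here are fixed by hand.

```python
import math
from collections import Counter

# Hypothetical hierarchy: root -> topic -> leaf class (illustrative only).
PARENT = {
    "root": None,
    "sci": "root",
    "rec": "root",
    "sci.space": "sci",
    "sci.med": "sci",
    "rec.autos": "rec",
}

def path_to_root(node):
    """Return the nodes from `node` up to the root, leaf first."""
    path = []
    while node is not None:
        path.append(node)
        node = PARENT[node]
    return path

def train(labeled_docs, vocab, alpha=1.0):
    """Estimate a Laplace-smoothed unigram distribution at every node.

    Each document's words are counted at its leaf class and at all
    ancestors, so inner nodes pool data from the classes below them.
    """
    counts = {node: Counter() for node in PARENT}
    for words, leaf in labeled_docs:
        for node in path_to_root(leaf):
            counts[node].update(words)
    dists = {}
    for node, c in counts.items():
        total = sum(c[w] for w in vocab) + alpha * len(vocab)
        dists[node] = {w: (c[w] + alpha) / total for w in vocab}
    return dists

def class_word_prob(dists, leaf, word, weights):
    """Mix the node distributions along the leaf-to-root path."""
    path = path_to_root(leaf)
    return sum(lam * dists[node][word] for lam, node in zip(weights, path))

def classify(dists, words, leaves, weights=(0.5, 0.3, 0.2)):
    """Pick the leaf with the highest log-likelihood (uniform class prior).

    `weights` are assumed, hand-set mixing proportions for
    (leaf, parent, root); words outside the vocabulary are skipped.
    """
    def score(leaf):
        return sum(math.log(class_word_prob(dists, leaf, w, weights))
                   for w in words if w in dists["root"])
    return max(leaves, key=score)
```

With only one training document per class, the parent and root distributions contribute evidence that a leaf alone would lack, which is the sparseness argument in the abstract. A fuller treatment would estimate the mixing proportions from data instead of fixing them.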