We propose a new method of classifying documents into categories. We define for each category a finite mixture model based on soft clustering of words. We treat the problem of classifying documents as that of conducting statistical hypothesis testing over finite mixture models, and employ the EM algorithm to efficiently estimate parameters in a finite mixture model. Experimental results indicate that our method outperforms existing methods.
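
The pipeline described above can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes that the soft word clusters are already given as fixed distributions P(w | cluster), that EM only estimates each category's mixing weights, and that the hypothesis test is approximated by picking the category with the highest log-likelihood. All names (`em_mixture_weights`, `classify`, the toy clusters and the `1e-9` floor for unseen words) are hypothetical.

```python
import math

UNSEEN = 1e-9  # assumed floor probability for words outside a cluster


def em_mixture_weights(words, cluster_word_probs, n_iter=50):
    """Estimate mixing weights theta for P(w) = sum_k theta[k] * P(w | k)."""
    k = len(cluster_word_probs)
    theta = [1.0 / k] * k  # start from uniform weights
    for _ in range(n_iter):
        expected = [0.0] * k
        for w in words:
            # E-step: responsibility of each cluster for this word occurrence
            joint = [theta[j] * cluster_word_probs[j].get(w, UNSEEN) for j in range(k)]
            total = sum(joint)
            for j in range(k):
                expected[j] += joint[j] / total
        # M-step: weights are the normalized expected counts
        theta = [e / len(words) for e in expected]
    return theta


def log_likelihood(words, theta, cluster_word_probs):
    """Log-likelihood of a word sequence under one category's mixture."""
    return sum(
        math.log(sum(t * p.get(w, UNSEEN) for t, p in zip(theta, cluster_word_probs)))
        for w in words
    )


def classify(words, category_thetas, cluster_word_probs):
    """Pick the category whose mixture assigns the document the highest likelihood."""
    return max(
        category_thetas,
        key=lambda c: log_likelihood(words, category_thetas[c], cluster_word_probs),
    )


# Toy data (invented for illustration): two word clusters, two categories.
clusters = [
    {"ball": 0.5, "team": 0.5},   # a "sports-like" cluster
    {"stock": 0.5, "bank": 0.5},  # a "finance-like" cluster
]
training = {
    "sports": ["ball", "team", "ball", "stock"],
    "finance": ["stock", "bank", "bank", "team"],
}
thetas = {c: em_mixture_weights(ws, clusters) for c, ws in training.items()}
predicted = classify(["ball", "team"], thetas, clusters)
```

Because every category shares the same clusters and differs only in its mixing weights, a word that is rare in a category's training text can still contribute evidence through its cluster. This is one plausible reading of why a mixture over soft clusters might help; the abstract itself does not spell out the mechanism.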