We propose using both labeled and unlabeled data with the Expectation-Maximization (EM) algorithm to estimate a generative model, from which we construct a Fisher kernel. Documents are modeled with the Naive Bayes generative probability. Through text categorization experiments, we empirically show that (a) given a sufficient amount of labeled data, the Fisher kernel built from labeled and unlabeled data outperforms Naive Bayes classifiers with EM and other methods; (b) the value of additional unlabeled data diminishes once the labeled set is large enough to estimate a reliable model; (c) using the categories as latent variables is effective; and (d) larger unlabeled training sets yield better results.
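As a rough illustration of the approach described above (not the paper's exact formulation), the sketch below fits a multinomial Naive Bayes model with semi-supervised EM, treating the categories as latent variables for the unlabeled documents, and then derives Fisher-score features from the fitted parameters; the Fisher kernel is the inner product of these scores. All names (em_naive_bayes, fisher_scores) are illustrative, and the score uses the common simplification of the raw gradient of the log-likelihood, omitting the Fisher information matrix normalization.

import numpy as np

def em_naive_bayes(X_l, y_l, X_u, n_classes, n_iter=20, alpha=1e-2):
    # Semi-supervised multinomial Naive Bayes fitted with EM.
    # X_l: labeled term-count matrix (n_l x V); y_l: int labels in {0..K-1};
    # X_u: unlabeled term-count matrix (n_u x V).
    R_l = np.eye(n_classes)[np.asarray(y_l)]            # fixed responsibilities
    R_u = np.full((X_u.shape[0], n_classes), 1.0 / n_classes)
    X = np.vstack([X_l, X_u])
    for _ in range(n_iter):
        R = np.vstack([R_l, R_u])
        # M-step: class priors and Laplace-smoothed word probabilities
        # re-estimated from labeled and (soft-labeled) unlabeled documents.
        pi = (R.sum(axis=0) + alpha) / (R.shape[0] + alpha * n_classes)
        counts = R.T @ X + alpha                        # (K x V)
        theta = counts / counts.sum(axis=1, keepdims=True)
        # E-step: update posteriors P(c|d) for unlabeled documents only.
        log_post = np.log(pi) + X_u @ np.log(theta).T
        log_post -= log_post.max(axis=1, keepdims=True)
        R_u = np.exp(log_post)
        R_u /= R_u.sum(axis=1, keepdims=True)
    return pi, theta

def fisher_scores(X, pi, theta):
    # Fisher score of each document w.r.t. the word parameters:
    #   d/d theta[c,w] log P(d) = P(c|d) * n_dw / theta[c,w],
    # with the categories acting as latent variables. The Fisher kernel is
    # (up to the omitted information-matrix normalization) the inner
    # product of these score vectors.
    log_post = np.log(pi) + X @ np.log(theta).T
    log_post -= log_post.max(axis=1, keepdims=True)
    post = np.exp(log_post)
    post /= post.sum(axis=1, keepdims=True)             # (n x K)
    U = post[:, :, None] * (X[:, None, :] / theta[None, :, :])  # (n, K, V)
    return U.reshape(X.shape[0], -1)

The resulting score vectors can be fed to any linear-kernel classifier, for instance an SVM, which matches the general strategy of exploiting a generative model inside a discriminative classifier.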