A maximum entropy approach to natural language processing
Computational Linguistics
Combining labeled and unlabeled data with co-training
COLT' 98 Proceedings of the eleventh annual conference on Computational learning theory
A re-examination of text categorization methods
Proceedings of the 22nd annual international ACM SIGIR conference on Research and development in information retrieval
Text Classification from Labeled and Unlabeled Documents using EM
Machine Learning - Special issue on information retrieval
Introduction to Modern Information Retrieval
Introduction to Modern Information Retrieval
Restricted Bayes Optimal Classifiers
Proceedings of the Seventeenth National Conference on Artificial Intelligence and Twelfth Conference on Innovative Applications of Artificial Intelligence
Semisupervised Regression with Cotraining-Style Algorithms
IEEE Transactions on Knowledge and Data Engineering
Semi-supervised learning with very few labeled training examples
AAAI'07 Proceedings of the 22nd national conference on Artificial intelligence - Volume 1
Semi-supervised learning for multi-component data classification
IJCAI'07 Proceedings of the 20th international joint conference on Artifical intelligence
New Labeling Strategy for Semi-supervised Document Categorization
KSEM '09 Proceedings of the 3rd International Conference on Knowledge Science, Engineering and Management
Semi-supervised learning based on nearest neighbor rule and cut edges
Knowledge-Based Systems
A refinement approach to handling model misfit in semi-supervised learning
ADMA'10 Proceedings of the 6th international conference on Advanced data mining and applications - Volume Part II
Integrating Generative and Discriminative Character-Based Models for Chinese Word Segmentation
ACM Transactions on Asian Language Information Processing (TALIP)
IJCAI'11 Proceedings of the Twenty-Second international joint conference on Artificial Intelligence - Volume Volume Two
A hybrid generative/discriminative method for semi-supervised classification
Knowledge-Based Systems
ACM Transactions on Intelligent Systems and Technology (TIST) - Special Section on Intelligent Mobile Knowledge Discovery and Management Systems and Special Issue on Social Web Mining
Hi-index | 0.00 |
Semi-supervised classifier design that simultaneously utilizes both labeled and unlabeled samples is a major research issue in machine learning. Existing semisupervised learning methods belong to either generative or discriminative approaches. This paper focuses on probabilistic semi-supervised classifier design and presents a hybrid approach to take advantage of the generative and discriminative approaches. Our formulation considers a generative model trained on labeled samples and a newly introduced bias correction model. Both models belong to the same model family. The proposed hybrid model is constructed by combining both generative and bias correction models based on the maximum entropy principle. The parameters of the bias correction model are estimated by using training data, and combination weights are estimated so that labeled samples are correctly classified. We use naive Bayes models as the generative models to apply the hybrid approach to text classification problems. In our experimental results on three text data sets, we confirmed that the proposed method significantly outperformed pure generative and discriminative methods when the classification performances of the both methods were comparable.