This paper presents a method for designing semi-supervised classifiers trained on both labeled and unlabeled samples. We focus on probabilistic semi-supervised classifier design for multi-class, single-labeled classification problems, and propose a hybrid approach that exploits the strengths of both generative and discriminative approaches. In our approach, we first consider a generative model trained on labeled samples and introduce a bias correction model; the two models belong to the same model family but have different parameters. We then construct a hybrid classifier by combining these models on the basis of the maximum entropy principle. To apply the hybrid approach to text classification problems, we employ naive Bayes models as the generative and bias correction models. Experimental results on four text data sets confirmed that the generalization ability of the hybrid classifier improved substantially when a large number of unlabeled samples was used for training and the labeled samples were too few to obtain good performance on their own. We also confirmed that the hybrid approach significantly outperformed both the generative and discriminative approaches when their individual performance was comparable. Moreover, we examined the performance of the hybrid classifier when the labeled and unlabeled data were drawn from different distributions.
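The combination step described above can be sketched in code. The following is a minimal, illustrative Python sketch, not the authors' implementation: two multinomial naive Bayes models (a generative model and a stand-in for the bias correction model) are combined log-linearly with mixing weights, which is the functional form a maximum-entropy combination of the two models yields. In the paper the bias correction model's parameters are estimated from unlabeled data; here, purely for illustration, it is trained on a second labeled subset, and all data, function names, and the weights `lam` are assumptions.

```python
import math
from collections import defaultdict

def train_nb(docs, labels, vocab, alpha=1.0):
    """Multinomial naive Bayes with Laplace smoothing; returns log-parameters."""
    classes = sorted(set(labels))
    log_prior, log_cond = {}, {}
    for c in classes:
        class_docs = [d for d, y in zip(docs, labels) if y == c]
        log_prior[c] = math.log(len(class_docs) / len(docs))
        counts = defaultdict(float)
        for d in class_docs:
            for w in d:
                counts[w] += 1.0
        total = sum(counts.values()) + alpha * len(vocab)
        log_cond[c] = {w: math.log((counts[w] + alpha) / total) for w in vocab}
    return log_prior, log_cond

def log_joint(model, doc, c):
    """log P(doc, c) under one naive Bayes model."""
    log_prior, log_cond = model
    return log_prior[c] + sum(log_cond[c][w] for w in doc if w in log_cond[c])

def hybrid_posterior(gen, bias, doc, classes, lam=(0.7, 0.3)):
    """P(c | doc) proportional to exp(lam1*log P_gen(doc,c) + lam2*log P_bias(doc,c)).

    The log-linear weights lam play the role of the combination
    coefficients in the maximum-entropy hybrid; here they are fixed
    by hand rather than estimated.
    """
    scores = {c: lam[0] * log_joint(gen, doc, c) + lam[1] * log_joint(bias, doc, c)
              for c in classes}
    m = max(scores.values())                      # stabilize the softmax
    z = sum(math.exp(s - m) for s in scores.values())
    return {c: math.exp(scores[c] - m) / z for c in scores}
```

For example, with tiny two-class text data, `hybrid_posterior(gen, bias, ["goal", "ball"], ["sport", "politics"])` returns a normalized class distribution favoring the class whose word statistics match the document under both component models.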