Co-training is a well-known semi-supervised learning paradigm that requires the data set to be described by two views of features. A notable characteristic shared by many co-training algorithms is that the selected unlabeled instances should be predicted with high confidence, since a high confidence score usually implies that the corresponding prediction is correct. Unfortunately, these high-confidence unlabeled instances are not always able to improve classification performance. In this paper, a new semi-supervised learning algorithm is proposed that combines the benefits of both co-training and active learning. The algorithm applies co-training to select the most reliable instances, according to the two criteria of high confidence and nearest neighbor, to boost the classifiers, and it also exploits the most informative instances through human annotation to further improve classification performance. Experiments on several UCI data sets and a natural language processing task demonstrate that our method achieves a more significant improvement for the same amount of human effort.
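The abstract does not give the algorithm's details, but the general idea of mixing co-training with active learning can be illustrated with a minimal sketch. The code below is an assumption-laden illustration, not the authors' method: it uses two Gaussian naive Bayes classifiers (one per view), pseudo-labels unlabeled instances on which both views agree with high confidence, and sends the least confident instances to a human oracle. The function and parameter names (co_train_active, conf_threshold, queries_per_round, oracle) are illustrative, and the agreement-plus-confidence rule stands in for the paper's confidence and nearest-neighbor criteria.

```python
# Hypothetical sketch of co-training combined with active learning.
# Not the paper's algorithm; names and selection rules are illustrative.
import numpy as np
from sklearn.naive_bayes import GaussianNB


def co_train_active(X1_l, X2_l, y_l, X1_u, X2_u, oracle,
                    rounds=10, conf_threshold=0.95, queries_per_round=2):
    """X1_*/X2_* are the two feature views; `oracle(i)` returns the true label of unlabeled instance i."""
    X1_l, X2_l, y_l = X1_l.copy(), X2_l.copy(), y_l.copy()
    unlabeled = np.arange(len(X1_u))           # indices of still-unlabeled instances
    clf1, clf2 = GaussianNB(), GaussianNB()

    for _ in range(rounds):
        if len(unlabeled) == 0:
            break
        clf1.fit(X1_l, y_l)
        clf2.fit(X2_l, y_l)
        p1 = clf1.predict_proba(X1_u[unlabeled])
        p2 = clf2.predict_proba(X2_u[unlabeled])
        conf = (p1.max(axis=1) + p2.max(axis=1)) / 2.0
        agree = p1.argmax(axis=1) == p2.argmax(axis=1)

        # Co-training step: pseudo-label instances both views agree on with high
        # confidence (a stand-in for the confidence + nearest-neighbor criteria).
        confident = np.where(agree & (conf >= conf_threshold))[0]
        # Active-learning step: query the oracle for the least confident instances.
        informative = np.argsort(conf)[:queries_per_round]

        new_idx = np.unique(np.concatenate([confident, informative]))
        if len(new_idx) == 0:
            break
        pseudo = clf1.classes_[p1.argmax(axis=1)]
        labels = pseudo.copy()
        labels[informative] = [oracle(unlabeled[i]) for i in informative]

        sel = unlabeled[new_idx]
        X1_l = np.vstack([X1_l, X1_u[sel]])
        X2_l = np.vstack([X2_l, X2_u[sel]])
        y_l = np.concatenate([y_l, labels[new_idx]])
        unlabeled = np.delete(unlabeled, new_idx)

    return clf1.fit(X1_l, y_l), clf2.fit(X2_l, y_l)
```

In this sketch the human effort is controlled by queries_per_round, so the budget of annotated instances stays fixed while the pseudo-labeled pool grows, which mirrors the paper's goal of gaining more improvement for the same amount of human effort.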