Semi-supervised learning algorithms are widely used to build strong learning models when there are not enough labeled instances. Some semi-supervised learning algorithms, including co-training and co-EM, use multiple views to build learning models. Past research has shown that multi-view learning usually outperforms learning with a single view. However, the conditions these methods require, such as independence of the views, are often hard to satisfy, which makes successful application to real-world problems difficult. We ask whether the performance of a multi-view semi-supervised learning method can be improved even when the available views are not necessarily independent. In this paper, we propose a simple sampling method, agreement-based sampling, as one way of improving the classification performance of multi-view semi-supervised learning algorithms. We apply agreement-based sampling to three major multi-view semi-supervised learning algorithms: co-training, co-EM, and multi-view semi-supervised ensemble. Experiments with real-world datasets show that our sampling method can indeed significantly improve the performance of all three algorithms. We also investigate the relation between the quality of the expanded labeled set generated by the semi-supervised learning algorithm and the classification error rate of the learned model.
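The abstract does not spell out the mechanics of agreement-based sampling. The following is only a minimal sketch of one plausible reading, assuming the name means: during the co-training loop, an unlabeled instance is added to the labeled set only when the two view-specific classifiers agree on its label. The toy `fit_threshold` classifier and the synthetic two-view data are illustrative inventions, not part of the paper.

```python
import random

def fit_threshold(labeled, view):
    """Fit a 1-D midpoint-threshold classifier on one view of the labeled set.

    A stand-in for the per-view base learner; any classifier would do here.
    """
    pos = [x[view] for x, y in labeled if y == 1]
    neg = [x[view] for x, y in labeled if y == 0]
    t = (sum(pos) / len(pos) + sum(neg) / len(neg)) / 2.0
    return lambda x: 1 if x[view] > t else 0

def agreement_based_sampling(labeled, unlabeled, rounds=5, batch=10):
    """Co-training loop that moves an unlabeled instance into the labeled
    set only when the two view classifiers agree on its label (the assumed
    "agreement-based" sampling step)."""
    labeled = list(labeled)
    unlabeled = list(unlabeled)
    for _ in range(rounds):
        h1 = fit_threshold(labeled, view=0)
        h2 = fit_threshold(labeled, view=1)
        # Sampling step: keep only instances on which the views agree.
        agreed = [(x, h1(x)) for x in unlabeled if h1(x) == h2(x)]
        for x, y in agreed[:batch]:
            labeled.append((x, y))
            unlabeled.remove(x)
    return labeled

# Synthetic two-view data: both views are noisy copies of the class label,
# so they are informative but deliberately not independent.
random.seed(0)
def make_point(y):
    base = 1.0 if y else 0.0
    return ((base + random.gauss(0, 0.3), base + random.gauss(0, 0.3)), y)

seed_set = [make_point(i % 2) for i in range(6)]          # small labeled seed
pool = [make_point(i % 2)[0] for i in range(100)]         # unlabeled pool
expanded = agreement_based_sampling(seed_set, pool)
```

The filter discards instances on which the views disagree, which are exactly the instances most likely to receive a wrong pseudo-label; this matches the paper's stated link between the quality of the expanded labeled set and the final error rate.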