Supervised machine learning usually requires a large number of labelled examples to learn accurately. However, labelling can be a costly and time-consuming process, especially when performed manually. In contrast, unlabelled examples are usually inexpensive and easy to obtain. This is the case for text classification tasks involving on-line data sources such as web pages, email and scientific papers. Semi-supervised learning, a relatively new area in machine learning, blends supervised and unsupervised learning and has the potential to reduce the need for expensive labelled data whenever only a small set of labelled examples is available. Multi-view semi-supervised learning additionally requires that the description of each example be partitioned into at least two distinct views. In this work, we propose a simple pre-processing approach for textual documents that easily constructs the two different views required by any multi-view learning algorithm. Experimental results on text classification suggest that our proposal for constructing the views performs well in practice.
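To make the multi-view requirement concrete, the following is a minimal sketch of one way two views could be constructed from plain-text documents: randomly partitioning the vocabulary into two disjoint halves, so that every document yields two distinct feature views. This is an illustrative assumption, not necessarily the pre-processing scheme proposed in the paper; the function name `build_two_views` and the random-split strategy are hypothetical.

```python
import random

def build_two_views(documents, seed=0):
    """Partition the vocabulary into two disjoint halves so that each
    document is described by two distinct token views, as multi-view
    semi-supervised learners (e.g. co-training) require.

    Illustrative random split -- not the paper's exact scheme.
    """
    # Collect the full vocabulary over all documents.
    vocab = sorted({tok for doc in documents for tok in doc.lower().split()})

    # Shuffle deterministically and split into two disjoint halves.
    rng = random.Random(seed)
    rng.shuffle(vocab)
    half = len(vocab) // 2
    view_a, view_b = set(vocab[:half]), set(vocab[half:])

    # Project every document onto each half of the vocabulary.
    views = []
    for doc in documents:
        toks = doc.lower().split()
        views.append(([t for t in toks if t in view_a],
                      [t for t in toks if t in view_b]))
    return views, (view_a, view_b)

docs = ["semi supervised learning",
        "text classification with unlabeled data"]
views, (a, b) = build_two_views(docs)
```

Each element of `views` is a pair of token lists, one per view, which can then be fed to the two per-view classifiers of any multi-view algorithm. A random split is only one option; splits based on document structure (e.g. title versus body) are another common choice.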