Multi-view Semi-supervised Learning: An Approach to Obtain Different Views from Text Datasets

Authors:
Edson Takashi Matsubara;Maria Carolina Monard;Gustavo E. A. P. A. Batista
Affiliations:
All three authors: University of São Paulo (USP), Institute of Mathematics and Computer Science (ICMC), Laboratory of Computational Intelligence (LABIC), P.O. Box 668, 13560-970, São Carlos, SP, Brazil, {e ...
Venue:
Proceedings of the 2005 conference on Advances in Logic Based Intelligent Systems: Selected Papers of LAPTEC 2005
Year:
2005

Abstract

The supervised machine learning approach usually requires a large number of labelled examples to learn accurately. However, labelling can be a costly and time-consuming process, especially when performed manually. In contrast, unlabelled examples are usually inexpensive and easy to obtain. This is the case for text classification tasks involving on-line data sources, such as web pages, email and scientific papers. Semi-supervised learning, a relatively new area in machine learning, represents a blend of supervised and unsupervised learning, and has the potential to reduce the need for expensive labelled data whenever only a small set of labelled examples is available. Multi-view semi-supervised learning requires that the description of each example be partitioned into at least two distinct views. In this work, we propose a simple approach to pre-processing textual documents that makes it easy to construct the two different views required by any multi-view learning algorithm. Experimental results on text classification are described, suggesting that our proposal for constructing the views performs well in practice.
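
The abstract does not say how the two views are built, so the following is only a minimal sketch of what "partitioning a document description into two views" could look like. It assumes, purely for illustration, that one view counts single words and the other counts adjacent word pairs. The function names `tokenize` and `build_views` are hypothetical and do not come from the paper.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def build_views(text):
    """Return two bag-of-words descriptions of the same document.

    View 1 counts single words; view 2 counts adjacent word pairs.
    This particular split is an illustrative assumption, not the
    method stated in the abstract.
    """
    tokens = tokenize(text)
    view_words = Counter(tokens)                    # single-word counts
    view_pairs = Counter(zip(tokens, tokens[1:]))   # adjacent-pair counts
    return view_words, view_pairs


doc = "Semi-supervised learning uses labelled and unlabelled examples"
words, pairs = build_views(doc)
```

A multi-view learner such as co-training would then train one classifier per view on the labelled documents and let each classifier propose labels for unlabelled documents. Any other pair of sufficiently distinct document descriptions could be substituted for the two counters used here.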