MCS'10 Proceedings of the 9th international conference on Multiple Classifier Systems
Semi-supervised learning deals with methods for automatically exploiting unlabeled samples in addition to a labeled set. Data selection is an important topic in active learning: it addresses how to select the most valuable unlabeled data to label, given that labeling data is costly. In this paper, we discuss in detail three aspects of data selection: how to select the unlabeled samples, how many unlabeled samples should be selected, and how to define the capacity of the training pool. Experiments using self-training based on C4.5 show that as the labeled ratio L grows, the initial error becomes smaller. In addition, when the labeled ratio L is less than 10%, the selection ratio should be set below 0.8. The error shows no significant change when the selection ratio is larger than 1.0.
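The three data-selection questions raised in the abstract (which unlabeled samples to pick, how many per round, and how large the training pool may grow) can be illustrated with a minimal self-training sketch. It is not the paper's implementation. A nearest-centroid classifier stands in for C4.5 so that the example needs only the standard library. The names `self_train`, `selection_ratio`, `pool_capacity`, and `make_toy_data` are hypothetical. The sketch also assumes that the selection ratio means the number of samples added per round relative to the size of the initial labeled set, which the abstract does not define.

```python
import random

def centroid_fit(X, y):
    """Fit a nearest-centroid classifier (a stand-in for C4.5, not the paper's learner)."""
    sums, counts = {}, {}
    for xi, yi in zip(X, y):
        acc = sums.setdefault(yi, [0.0] * len(xi))
        for j, v in enumerate(xi):
            acc[j] += v
        counts[yi] = counts.get(yi, 0) + 1
    return {c: [s / counts[c] for s in sums[c]] for c in sums}

def centroid_predict(model, x):
    """Return (label, confidence); confidence is the distance margin to the runner-up class."""
    dists = sorted(
        (sum((a - b) ** 2 for a, b in zip(x, c)) ** 0.5, label)
        for label, c in model.items()
    )
    best_dist, best_label = dists[0]
    margin = dists[1][0] - best_dist if len(dists) > 1 else 0.0
    return best_label, margin

def self_train(X_l, y_l, X_u, selection_ratio=0.5, pool_capacity=None, max_rounds=10):
    """Self-training loop; all parameter names and their meaning are assumptions."""
    X_train, y_train = list(X_l), list(y_l)
    unlabeled = list(X_u)
    # Assumed meaning of the selection ratio: samples added per round,
    # relative to the size of the initial labeled set.
    per_round = max(1, int(selection_ratio * len(X_l)))
    for _ in range(max_rounds):
        if not unlabeled:
            break
        # How large may the training pool grow?
        room = len(unlabeled) if pool_capacity is None else pool_capacity - len(X_train)
        if room <= 0:
            break
        model = centroid_fit(X_train, y_train)
        # Which samples to select: the most confidently predicted ones.
        scored = [(centroid_predict(model, x), x) for x in unlabeled]
        scored.sort(key=lambda item: item[0][1], reverse=True)
        k = min(per_round, room, len(scored))
        for (label, _margin), x in scored[:k]:
            X_train.append(x)
            y_train.append(label)  # pseudo-label, not a true label
        unlabeled = [x for _, x in scored[k:]]
    return centroid_fit(X_train, y_train), len(X_train)

def make_toy_data(seed=0):
    """Two well-separated 2-D clusters: 4 labeled points and 20 unlabeled points."""
    rng = random.Random(seed)
    X_l = [(0.0, 0.1), (0.2, -0.1), (5.0, 5.1), (4.8, 4.9)]
    y_l = [0, 0, 1, 1]
    X_u = [(rng.gauss(c, 0.5), rng.gauss(c, 0.5)) for c in (0.0, 5.0) for _ in range(10)]
    return X_l, y_l, X_u
```

Calling `self_train(*make_toy_data(), selection_ratio=0.5, pool_capacity=10)` adds two pseudo-labeled samples per round until the pool holds ten samples. Varying `selection_ratio` and the size of the labeled set in such a loop is one way to explore the trade-offs the abstract reports, though the thresholds it mentions (0.8 and 1.0) come from the paper's own C4.5 experiments and are not reproduced by this toy setup.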