The development of data-mining applications such as text classification and molecular profiling has shown the need for machine learning algorithms that can benefit from both labeled and unlabeled data, where the unlabeled examples often greatly outnumber the labeled ones. In this paper we present a two-stage classifier that improves its predictive accuracy by making use of the available unlabeled data. It applies a weighted nearest-neighbor classification algorithm with the combined example sets as its knowledge base. The examples from the unlabeled set are "pre-labeled" by an initial classifier that is built using the limited available training data. By choosing appropriate weights for this pre-labeled data, the nearest-neighbor classifier consistently improves on the original classifier.
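A minimal sketch of the two-stage idea described above, written with scikit-learn. The choice of base classifier (Gaussian naive Bayes), the number of neighbors k, and the down-weighting factor for pre-labeled examples are illustrative assumptions, not the settings used in the paper.

```python
import numpy as np
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier

def two_stage_fit_predict(X_labeled, y_labeled, X_unlabeled, X_test,
                          k=5, prelabel_weight=0.3):
    # Stage 1: build an initial classifier from the limited labeled data
    # and use it to "pre-label" the unlabeled examples.
    initial = GaussianNB().fit(X_labeled, y_labeled)
    y_prelabeled = initial.predict(X_unlabeled)

    # Combine the labeled and pre-labeled sets into one knowledge base.
    X_all = np.vstack([X_labeled, X_unlabeled])
    y_all = np.concatenate([y_labeled, y_prelabeled])

    # Per-example weights: full weight for true labels, reduced weight
    # for pre-labeled examples (0.3 is an assumed value).
    w_all = np.concatenate([np.ones(len(y_labeled)),
                            np.full(len(y_prelabeled), prelabel_weight)])

    # Stage 2: weighted nearest-neighbor vote over the combined set.
    nn = KNeighborsClassifier(n_neighbors=k).fit(X_all, y_all)
    neighbor_idx = nn.kneighbors(X_test, return_distance=False)

    classes = np.unique(y_all)
    preds = []
    for idx in neighbor_idx:
        votes = {c: 0.0 for c in classes}
        for i in idx:
            votes[y_all[i]] += w_all[i]
        preds.append(max(votes, key=votes.get))
    return np.array(preds)
```

In this sketch, giving pre-labeled examples a weight below one lets the abundant unlabeled data influence the nearest-neighbor vote without letting its noisier labels override the genuinely labeled examples.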