We consider the problem of using a large amount of unlabeled data to improve feature selection in high-dimensional settings when only a small number of labeled examples is available. We propose a new method, the semi-supervised ensemble learning guided feature ranking method (SEFR), which combines a bagged ensemble of standard semi-supervised approaches with a permutation-based out-of-bag feature importance measure that takes both labeled and unlabeled data into account. We provide empirical results on several benchmark data sets indicating that SEFR can lead to significant improvements over state-of-the-art supervised and semi-supervised algorithms.
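The pipeline described above can be sketched as follows. This is a minimal illustration, not the paper's implementation: the base learner (a tiny logistic regression), the single round of confidence-thresholded self-training, and the hyperparameters are all assumptions made for the example, and the importance measure here permutes only the labeled out-of-bag points (the paper's measure also exploits the unlabeled data at this step).

```python
import numpy as np

def _fit_logreg(X, y, steps=200, lr=0.5):
    # Illustrative base learner: logistic regression via gradient descent.
    w = np.zeros(X.shape[1])
    for _ in range(steps):
        p = 1.0 / (1.0 + np.exp(-X @ w))
        w -= lr * X.T @ (p - y) / len(y)
    return w

def _self_train(X_lab, y_lab, X_unlab, threshold=0.9):
    # One round of self-training (an assumed stand-in for the paper's
    # semi-supervised base method): pseudo-label confident unlabeled
    # points, then refit on the augmented sample.
    w = _fit_logreg(X_lab, y_lab)
    p = 1.0 / (1.0 + np.exp(-X_unlab @ w))
    conf = (p > threshold) | (p < 1.0 - threshold)
    if conf.any():
        X_aug = np.vstack([X_lab, X_unlab[conf]])
        y_aug = np.concatenate([y_lab, (p[conf] > 0.5).astype(float)])
        w = _fit_logreg(X_aug, y_aug)
    return w

def sefr_importance(X_lab, y_lab, X_unlab, n_estimators=25, seed=0):
    # Bag self-trained classifiers over bootstrap samples of the labeled
    # data; score feature j by the mean drop in out-of-bag accuracy when
    # column j is randomly permuted.
    rng = np.random.default_rng(seed)
    n, d = X_lab.shape
    score, used = np.zeros(d), np.zeros(d)
    for _ in range(n_estimators):
        boot = rng.integers(0, n, size=n)            # bootstrap indices
        oob = np.setdiff1d(np.arange(n), boot)       # out-of-bag indices
        if oob.size == 0:
            continue
        w = _self_train(X_lab[boot], y_lab[boot], X_unlab)
        acc = lambda X: np.mean(((X @ w) > 0) == (y_lab[oob] > 0.5))
        base = acc(X_lab[oob])
        for j in range(d):
            Xp = X_lab[oob].copy()
            Xp[:, j] = rng.permutation(Xp[:, j])     # break feature j
            score[j] += base - acc(Xp)
            used[j] += 1
    return score / np.maximum(used, 1)

# Toy usage: feature 0 carries the class signal, feature 1 is pure noise.
rng = np.random.default_rng(1)
y = rng.integers(0, 2, 30).astype(float)
X_lab = np.column_stack([2 * y - 1 + 0.3 * rng.standard_normal(30),
                         rng.standard_normal(30)])
yu = rng.integers(0, 2, 200).astype(float)
X_unlab = np.column_stack([2 * yu - 1 + 0.3 * rng.standard_normal(200),
                           rng.standard_normal(200)])
imp = sefr_importance(X_lab, y, X_unlab)
```

On this toy data the informative feature receives a clearly higher importance score than the noise feature, which is the ranking behavior the method is designed to produce.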