The feature subset selection problem is of growing importance in many machine learning applications where the number of variables is very high. Many algorithms can approach this problem in supervised databases, but when examples from one or more classes are unavailable, supervised feature subset selection algorithms cannot be applied directly. One such algorithm is correlation-based feature selection (CFS). In this work we propose an adaptation of this algorithm that can be applied when only positive and unlabelled examples are available. As far as we know, this is the first time the feature subset selection problem has been studied in the positive-unlabelled learning context. We have tested this adaptation on synthetic datasets obtained by sampling Bayesian network models in which we know which variables are (in)dependent of the class. We have also tested our adaptation on real-life databases where the absence of negative examples has been simulated. The results show that, given enough positive examples, it is possible to obtain good solutions to the feature subset selection problem when only positive and unlabelled instances are available.
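To make the starting point concrete, the standard CFS heuristic scores a candidate subset by trading feature-class correlation against feature-feature redundancy. The sketch below implements only this classical merit function (as defined by Hall); it is not the paper's positive-unlabelled adaptation, which would change how the feature-class correlations are estimated when no negative examples exist. The function and argument names are illustrative.

```python
import math

def cfs_merit(k, avg_feat_class_corr, avg_feat_feat_corr):
    """Classical CFS merit of a subset of k features.

    Merit_S = (k * r_cf) / sqrt(k + k*(k-1) * r_ff)

    where r_cf is the mean feature-class correlation of the subset
    and r_ff is the mean pairwise feature-feature correlation.
    Higher relevance (r_cf) raises the merit; higher redundancy
    (r_ff) lowers it.
    """
    if k <= 0:
        raise ValueError("subset must contain at least one feature")
    return (k * avg_feat_class_corr) / math.sqrt(
        k + k * (k - 1) * avg_feat_feat_corr
    )

# For a single feature the merit reduces to its class correlation:
print(cfs_merit(1, 0.5, 0.0))  # 0.5

# A redundant 4-feature subset scores lower than a diverse one
# with the same average relevance:
print(cfs_merit(4, 0.5, 0.1) > cfs_merit(4, 0.5, 0.5))  # True
```

In the supervised setting r_cf is typically estimated with symmetrical uncertainty against the known class labels; in the positive-unlabelled setting studied here, those labels are only partially available, which is precisely the gap the proposed adaptation addresses.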