Feature Extraction, Construction and Selection: A Data Mining Perspective
An Introduction to Variable and Feature Selection
The Journal of Machine Learning Research
Fast Binary Feature Selection with Conditional Mutual Information
The Journal of Machine Learning Research
Statistical Comparisons of Classifiers over Multiple Data Sets
The Journal of Machine Learning Research
Breeding value classification in manchego sheep: a study of attribute selection and construction
KES'05: Proceedings of the 9th International Conference on Knowledge-Based Intelligent Information and Engineering Systems, Part II
Speeding up incremental wrapper feature subset selection with Naive Bayes classifier
Knowledge-Based Systems
This paper deals with the problem of supervised wrapper-based feature subset selection in datasets with a very large number of attributes. In such datasets, sophisticated search algorithms (beam search, branch and bound, best-first, genetic algorithms, etc.) become intractable in the wrapper approach because of the large number of wrapper evaluations they require. Hence, the recent literature has turned to hybrid selection algorithms: starting from a filter ranking, they perform an incremental wrapper selection over that ranking. Although these methods work well, they still have two drawbacks: (1) depending on the complexity of the wrapper search method, the number of wrapper evaluations can still be too large; and (2) they rely on a univariate ranking that ignores the interaction between the variables already included in the selected subset and the remaining ones. In this paper we propose to work incrementally at two levels (block level and attribute level) so that a filter re-ranking method based on conditional mutual information can be applied. The results show that this drastically reduces the number of wrapper evaluations without degrading the quality of the obtained subset; in fact, the same accuracy is achieved while reducing the number of selected attributes.
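To make the two-level scheme concrete, the following is a minimal, self-contained sketch of the idea described in the abstract: rank attributes by mutual information with the class, wrapper-evaluate them incrementally in blocks with a leave-one-out naive Bayes classifier, and after each block re-rank the remaining attributes by their mutual information with the class conditioned on the already-selected subset. All function names, the block size, and the toy dataset are our own illustrative choices, not the paper's implementation.

```python
import math
from collections import Counter

def mi(xs, ys):
    """Empirical mutual information I(X;Y) between two discrete sequences."""
    n = len(xs)
    pxy, px, py = Counter(zip(xs, ys)), Counter(xs), Counter(ys)
    return sum((c / n) * math.log((c / n) / ((px[x] / n) * (py[y] / n)))
               for (x, y), c in pxy.items())

def cmi(xs, ys, zs):
    """Conditional mutual information I(X;Y|Z); Z given as a tuple per row."""
    n = len(zs)
    total = 0.0
    for z, cz in Counter(zs).items():
        idx = [i for i in range(n) if zs[i] == z]
        total += (cz / n) * mi([xs[i] for i in idx], [ys[i] for i in idx])
    return total

def nb_loo_accuracy(X, y, subset):
    """Wrapper evaluation: leave-one-out accuracy of a categorical naive
    Bayes classifier (Laplace smoothing) restricted to `subset`."""
    if not subset:
        return 0.0
    n, classes, correct = len(y), sorted(set(y)), 0
    for test in range(n):
        train = [i for i in range(n) if i != test]
        best, best_lp = None, -math.inf
        for c in classes:
            in_c = [i for i in train if y[i] == c]
            lp = math.log((len(in_c) + 1) / (len(train) + len(classes)))
            for f in subset:
                vals = set(X[i][f] for i in train)
                cnt = sum(1 for i in in_c if X[i][f] == X[test][f])
                lp += math.log((cnt + 1) / (len(in_c) + len(vals)))
            if lp > best_lp:
                best, best_lp = c, lp
        correct += best == y[test]
    return correct / n

def iwss_rerank(X, y, block_size=2):
    """Sketch of incremental wrapper selection with filter re-ranking:
    attribute level = one wrapper evaluation per candidate in the block,
    block level = CMI-based re-ranking of the remaining attributes."""
    cols = lambda f: [row[f] for row in X]
    remaining = sorted(range(len(X[0])), key=lambda f: -mi(cols(f), y))
    selected, best_acc = [], 0.0
    while remaining:
        block, remaining = remaining[:block_size], remaining[block_size:]
        for f in block:  # attribute level
            acc = nb_loo_accuracy(X, y, selected + [f])
            if acc > best_acc:
                selected.append(f)
                best_acc = acc
        if selected and remaining:  # block level: re-rank by I(X_f; Y | selected)
            zs = [tuple(row[f] for f in selected) for row in X]
            remaining.sort(key=lambda f: -cmi(cols(f), y, zs))
    return selected, best_acc

# Toy data: attribute 0 determines the class, attribute 1 is a redundant
# copy, attributes 2-3 are noise; only attribute 0 should be kept.
data = [(0, 0, 1, 0), (0, 0, 0, 1), (0, 0, 1, 1), (0, 0, 0, 0),
        (1, 1, 1, 0), (1, 1, 0, 1), (1, 1, 1, 1), (1, 1, 0, 0)]
labels = [0, 0, 0, 0, 1, 1, 1, 1]
subset, accuracy = iwss_rerank(data, labels, block_size=2)
```

Note the key saving: the wrapper (the expensive leave-one-out evaluation) is invoked once per candidate attribute, while the cheap filter re-ranking between blocks lets interactions with the already-selected subset influence the visit order.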