Induction of Strong Feature Subsets
PKDD '97 Proceedings of the First European Symposium on Principles of Data Mining and Knowledge Discovery
The problem of feature subset selection can be defined as the selection of a relevant subset of features that allows a learning algorithm to induce small, high-accuracy models. This problem is of primary importance because irrelevant and redundant features may slow down the learner, especially in the context of high dimensionality, and reduce both the accuracy and the comprehensibility of the induced model. Two main approaches have been developed: the first is algorithm-independent (the filter approach) and considers only the data, while the second is algorithm-dependent and takes into account both the data and a given learning algorithm (the wrapper approach). Recent work has studied the interest of rough set theory, and more particularly of its notions of reducts and core, for the problem of feature subset selection. Different methods have been proposed to select features using both the core and reduct concepts, whereas other research shows that useful feature subsets do not necessarily contain all the features in cores. In this paper, we underline the fact that rough set theory is concerned with a deterministic analysis of the attribute dependencies that underlie the two notions of reduct and core. We extend the notion of dependency so that it captures both deterministic and non-deterministic dependencies. A new notion of strong reducts is then introduced, which leads to the definition of strong feature subsets (SFS). The interest of SFS is illustrated by the improvement of the accuracy of C4.5 on real-world datasets. Our study shows that, in general, the highest-accuracy subset is not the best one with respect to the filter criteria. The highest-accuracy subset is found by the new approach with minimum cost.
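The deterministic attribute dependency that reducts and cores rest on can be sketched as follows: γ(C, D) is the fraction of objects whose indiscernibility class under the condition attributes C is consistent on the decision attribute D. The toy dataset and function names below are illustrative assumptions, not taken from the paper.

```python
from collections import defaultdict

def partition(rows, attrs):
    """Group row indices into indiscernibility classes over the given attributes."""
    blocks = defaultdict(list)
    for i, row in enumerate(rows):
        blocks[tuple(row[a] for a in attrs)].append(i)
    return list(blocks.values())

def dependency(rows, cond_attrs, dec_attr):
    """gamma(C, D): fraction of objects in the positive region,
    i.e. whose C-indiscernibility class is consistent on D."""
    pos = 0
    for block in partition(rows, cond_attrs):
        decisions = {rows[i][dec_attr] for i in block}
        if len(decisions) == 1:  # deterministic block
            pos += len(block)
    return pos / len(rows)

# Hypothetical toy decision table: condition attributes 'a', 'b', decision 'd'.
rows = [
    {"a": 0, "b": 0, "d": "yes"},
    {"a": 0, "b": 0, "d": "yes"},
    {"a": 0, "b": 1, "d": "no"},
    {"a": 1, "b": 1, "d": "no"},
    {"a": 1, "b": 0, "d": "yes"},
]

print(dependency(rows, ["a", "b"], "d"))  # 1.0: {a, b} determines d
print(dependency(rows, ["a"], "d"))       # 0.0: no block is consistent on d
```

Under this definition a reduct is a minimal attribute subset preserving the full dependency, which is exactly the deterministic notion the paper argues is too rigid for feature selection.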
The contribution of this work is fourfold: (1) an analysis of feature subset selection in the rough set context; (2) the introduction of new definitions based on a generalized rough set theory, i.e., α-RST; (3) a reformulation of the selection problem; (4) the description of a hybrid method combining the filter and wrapper approaches.
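To show the flavour of a non-deterministic generalization such as α-RST, a dependency can be relaxed so that a block counts toward the positive region when its dominant decision covers at least a fraction α of the block (in the spirit of variable-precision models). This is only one plausible reading; the paper's exact α-RST definitions differ, and the data and names here are assumptions for illustration.

```python
from collections import defaultdict

def alpha_dependency(rows, cond_attrs, dec_attr, alpha=0.8):
    """Relaxed dependency: a block is counted as (almost) positive when
    its majority decision reaches the precision threshold alpha.
    alpha = 1.0 recovers the deterministic rough set dependency."""
    blocks = defaultdict(list)
    for i, row in enumerate(rows):
        blocks[tuple(row[a] for a in cond_attrs)].append(i)
    pos = 0
    for block in blocks.values():
        counts = defaultdict(int)
        for i in block:
            counts[rows[i][dec_attr]] += 1
        if max(counts.values()) / len(block) >= alpha:
            pos += len(block)
    return pos / len(rows)

# Same hypothetical toy decision table as above.
rows = [
    {"a": 0, "b": 0, "d": "yes"},
    {"a": 0, "b": 0, "d": "yes"},
    {"a": 0, "b": 1, "d": "no"},
    {"a": 1, "b": 1, "d": "no"},
    {"a": 1, "b": 0, "d": "yes"},
]

print(alpha_dependency(rows, ["a"], "d", alpha=1.0))  # 0.0: strict, as before
print(alpha_dependency(rows, ["a"], "d", alpha=0.6))  # 0.6: one 2/3-pure block counts
```

Lowering α lets attribute subsets that are merely *almost* consistent score a non-zero dependency, which is what makes it possible to rank subsets that a purely deterministic reduct computation would discard outright.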