A Study on the Importance of Differential Prioritization in Feature Selection Using Toy Datasets
PRIB '08 Proceedings of the Third IAPR International Conference on Pattern Recognition in Bioinformatics
The large number of genes in microarray data makes feature selection techniques more crucial than ever. From ranking-based filter procedures to classifier-based wrapper techniques, many studies have devised their own flavor of feature selection. Only a handful of these studies have examined the effect of redundancy in the predictor set on classification accuracy, and fewer still the effect of varying the relative importance of relevance versus redundancy. We present a filter-based feature selection technique that incorporates three elements: relevance, redundancy, and differential prioritization. With the aid of differential prioritization, our feature selection technique achieves better accuracies than those of previous studies while using fewer genes in the predictor set. At the same time, we avoid the pitfalls of over-optimistic accuracy estimates by using a more realistic evaluation procedure than internal leave-one-out cross-validation.
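The abstract does not spell out the scoring function, but a common way to combine the three elements it names is a greedy filter in which a parameter alpha in [0, 1] sets the degree of prioritization between relevance and antiredundancy, e.g. score(f) = relevance(f)^alpha * antiredundancy(f)^(1 - alpha). The sketch below is an illustrative assumption along those lines, not the authors' exact algorithm; the function and parameter names (`select_features`, `alpha`) are hypothetical.

```python
# Hedged sketch of a greedy filter-based selection with a differential
# prioritization parameter alpha (an assumption, not the paper's exact method):
#   score(f) = relevance(f)^alpha * antiredundancy(f | selected)^(1 - alpha)
# relevance    = |Pearson correlation of the feature with the class labels|
# redundancy   = mean |Pearson correlation| with the already-selected features

def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy) if sx and sy else 0.0

def select_features(X, y, k, alpha=0.5):
    """Greedily pick k feature indices from X (a list of feature columns).

    alpha = 1 ranks by relevance alone; alpha = 0 by antiredundancy alone;
    intermediate values trade the two off, mimicking differential
    prioritization in spirit.
    """
    relevance = [abs(pearson(col, y)) for col in X]
    selected, candidates = [], list(range(len(X)))
    while candidates and len(selected) < k:
        best, best_score = None, -1.0
        for j in candidates:
            if selected:
                red = sum(abs(pearson(X[j], X[s])) for s in selected) / len(selected)
            else:
                red = 0.0  # first pick: no redundancy penalty yet
            antired = max(1.0 - red, 0.0)
            score = (relevance[j] ** alpha) * (antired ** (1.0 - alpha))
            if score > best_score:
                best, best_score = j, score
        selected.append(best)
        candidates.remove(best)
    return selected
```

For example, with a feature perfectly correlated with the labels, an exact duplicate of it, and a weakly relevant third feature, alpha = 0.5 picks the perfect feature first and then skips its duplicate (zero antiredundancy) in favor of the less redundant one.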