Development of Two-Stage SVM-RFE Gene Selection Strategy for Microarray Expression Data Analysis
IEEE/ACM Transactions on Computational Biology and Bioinformatics (TCBB)
A method is described for performing sparse and stable gene selection from a number of unstable but low-cost SVM-RFE units, referred to as SVM-RFE subunits. Using a comprehensive simulation study, we show that introducing a consensus constraint with respect to variations in the gene removal policy, together with a stability constraint with respect to perturbations in the training data, can markedly improve the gene selection precision, dimensionality reduction ratio, and stability of low-cost SVM-RFE subunits while keeping computational costs affordable. The method, which does not require the number of selected genes to be fixed in advance, is divided into two stages. In the first stage (spreading), multiple rough gene removal policies are applied to multiple surrogate training datasets. In the second stage (despreading), multiple consensus gene sets with respect to variations in the gene removal policy are obtained and passed through a stability filter, which selects the best-performing gene set. Hence, while the consensus constraint performs strong dimensionality reduction at affordable computational cost, the stability constraint ensures acceptable gene selection stability and further dimensionality reduction. The method is validated on three benchmark microarray datasets.
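The two stages can be illustrated with a minimal sketch; it is not the authors' implementation. Everything the abstract leaves unspecified is an assumption here: surrogate datasets are drawn by bootstrap resampling, a gene removal policy is a fixed fraction of genes dropped per round for a fixed number of rounds, the consensus is a plain set intersection across policies, and the stability filter picks the consensus set with the highest mean Jaccard similarity to the others. The linear SVM is a small Pegasos-style subgradient trainer written in plain Python to keep the example dependency-free; all function and parameter names are hypothetical.

```python
import random


def train_linear_svm(X, y, epochs=20, lam=0.01, seed=0):
    """Pegasos-style subgradient descent for a bias-free linear SVM; labels in {-1, +1}."""
    rng = random.Random(seed)
    w = [0.0] * len(X[0])
    order = list(range(len(X)))
    t = 0
    for _ in range(epochs):
        rng.shuffle(order)
        for i in order:
            t += 1
            eta = 1.0 / (lam * t)
            margin = y[i] * sum(wj * xj for wj, xj in zip(w, X[i]))
            w = [(1.0 - eta * lam) * wj for wj in w]  # shrink (regularization step)
            if margin < 1.0:  # hinge loss is active for this sample
                w = [wj + eta * y[i] * xj for wj, xj in zip(w, X[i])]
    return w


def rfe_subunit(X, y, drop, rounds):
    """One low-cost SVM-RFE subunit (assumed policy): per round, retrain and
    drop the fraction `drop` of active genes with the smallest squared weight."""
    active = list(range(len(X[0])))
    for _ in range(rounds):
        if len(active) <= 1:
            break
        X_active = [[row[j] for j in active] for row in X]
        w = train_linear_svm(X_active, y)
        n_remove = min(len(active) - 1, max(1, int(drop * len(active))))
        ranked = sorted(range(len(active)), key=lambda k: w[k] ** 2)
        removed = set(ranked[:n_remove])
        active = [g for k, g in enumerate(active) if k not in removed]
    return set(active)


def jaccard(a, b):
    """Jaccard similarity of two gene sets (defined as 1.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def two_stage_selection(X, y, policies=(0.5, 0.25), n_surrogates=4, rounds=2, seed=0):
    rng = random.Random(seed)
    n = len(X)
    consensus_sets = []
    for _ in range(n_surrogates):
        # Spreading: every removal policy runs on a bootstrap surrogate dataset.
        idx = [rng.randrange(n) for _ in range(n)]
        Xs, ys = [X[i] for i in idx], [y[i] for i in idx]
        subsets = [rfe_subunit(Xs, ys, drop, rounds) for drop in policies]
        # Consensus constraint (assumed): keep genes selected under all policies.
        consensus_sets.append(set.intersection(*subsets))

    # Despreading: stability filter (assumed criterion) over the consensus sets.
    def stability(s):
        others = [t for t in consensus_sets if t is not s]
        return sum(jaccard(s, t) for t in others) / max(1, len(others))

    return max(consensus_sets, key=stability)
```

Calling `two_stage_selection(X, y)` with `X` as a list of per-sample expression rows and `y` as labels in {-1, +1} returns a set of gene indices. The size of that set is not fixed in advance: it follows from the intersection across policies, so it can be smaller than any single subunit's output and may even be empty on noisy data.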