Many methods have been developed for classification and gene selection with microarray data, and they usually produce a ranking of genes. Evaluating the statistical significance of that ranking is important for interpreting the results and for guiding further biological investigation, but existing work has not adequately addressed this question for machine learning methods. Here, we formulate the problem in the framework of hypothesis testing and propose a solution based on resampling. The proposed r-test methods convert gene ranking results into position p-values that evaluate the significance of genes. The methods are tested on three real microarray data sets and three simulated data sets, with support vector machines as the method of classification and gene selection. The obtained position p-values help determine the number of genes to select and enable scientists to analyze the selection results of sophisticated multivariate methods under the same statistical inference paradigm used for simple hypothesis testing methods.
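The general idea of turning a gene ranking into a position p-value by resampling can be illustrated with a small sketch. This is not the r-test procedure itself, whose details the abstract does not give. It assumes a simple t-like score in place of support vector machine weights, a label-permutation null distribution, and hypothetical function names (`rank_genes`, `mean_rank`, `position_p_values`) and parameter defaults chosen only for illustration.

```python
import numpy as np

def rank_genes(X, y):
    """Rank genes by a t-like score (rank 1 = most discriminative).
    Assumption: stands in for an SVM-based ranking."""
    a, b = X[y == 0], X[y == 1]
    se = np.sqrt(a.var(axis=0, ddof=1) / len(a) + b.var(axis=0, ddof=1) / len(b))
    score = np.abs(a.mean(axis=0) - b.mean(axis=0)) / (se + 1e-12)
    order = np.argsort(-score)
    ranks = np.empty_like(order)
    ranks[order] = np.arange(1, len(order) + 1)
    return ranks

def mean_rank(X, y, n_resamples, frac, rng):
    """Average rank of each gene over random class-stratified subsamples."""
    idx0, idx1 = np.flatnonzero(y == 0), np.flatnonzero(y == 1)
    k0, k1 = max(2, int(frac * len(idx0))), max(2, int(frac * len(idx1)))
    total = np.zeros(X.shape[1])
    for _ in range(n_resamples):
        sub = np.concatenate([rng.choice(idx0, k0, replace=False),
                              rng.choice(idx1, k1, replace=False)])
        total += rank_genes(X[sub], y[sub])
    return total / n_resamples

def position_p_values(X, y, n_resamples=30, n_null=20, frac=0.8, seed=0):
    """Empirical p-value of each gene's mean rank position.
    Assumption: the null is built by permuting class labels."""
    rng = np.random.default_rng(seed)
    observed = mean_rank(X, y, n_resamples, frac, rng)
    null = np.sort(np.concatenate([
        mean_rank(X, rng.permutation(y), n_resamples, frac, rng)
        for _ in range(n_null)
    ]))
    # Number of null mean ranks that are at least as good (small) as observed.
    counts = np.searchsorted(null, observed, side="right")
    return (counts + 1) / (len(null) + 1)

# Toy data: 40 samples, 50 genes, the first 3 genes carry a class signal.
rng = np.random.default_rng(1)
y = np.repeat([0, 1], 20)
X = rng.normal(size=(40, 50))
X[y == 1, :3] += 3.0
p = position_p_values(X, y)
print(np.round(p[:5], 3))
```

In this toy setting the three informative genes should receive the smallest p-values, and a cutoff on `p` then suggests how many genes to keep. Swapping `rank_genes` for a ranking derived from a trained classifier would leave the rest of the sketch unchanged.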