Recursive Mahalanobis Separability Measure for Gene Subset Selection

Authors:
Kezhi Z. Mao; Wenyin Tang
Affiliations:
Nanyang Technological University, Singapore; Nanyang Technological University, Singapore
Venue:
IEEE/ACM Transactions on Computational Biology and Bioinformatics (TCBB)
Year:
2011


Abstract

The Mahalanobis class separability measure provides an effective evaluation of the discriminative power of a feature subset and is widely used in feature selection. However, the measure is computationally intensive, or even prohibitive, when applied to gene expression data. This study proposes a recursive approach to evaluating the Mahalanobis measure, with the goal of reducing computational overhead. Instead of evaluating the measure directly in the high-dimensional space, the recursive approach evaluates it through successive evaluations in 2D space. Because of its recursive nature, the approach is extremely efficient when combined with a forward search procedure. In addition, gene subsets selected by the Mahalanobis measure tend to overfit the training data and generalize poorly to unseen test data because of the small sample sizes typical of gene expression problems. To alleviate overfitting, the study proposes a regularized recursive Mahalanobis measure and provides guidelines for determining the regularization parameters. Experiments on five gene expression problems show that the regularized recursive Mahalanobis measure substantially outperforms both the nonregularized Mahalanobis measures and the benchmark recursive feature elimination (RFE) algorithm on all five problems.
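
The abstract does not state the recursion itself, so the following Python sketch is only an illustration of the general idea, not the authors' 2D formulation. It assumes a two-class problem, the separability score J = d^T S^{-1} d (d is the class-mean difference, S the pooled within-class covariance), and a simple ridge term `reg * I` as the regularizer. The function name `forward_select_mahalanobis` and the parameter `reg` are hypothetical. Each forward-search step updates J and the inverse covariance of the selected subset with the standard block-inverse (Schur complement) identity, so no high-dimensional matrix is inverted from scratch.

```python
import numpy as np

def forward_select_mahalanobis(X, y, n_select, reg=0.0):
    """Greedy forward search on J = d^T (S + reg*I)^{-1} d (illustrative sketch)."""
    X0, X1 = X[y == 0], X[y == 1]
    d = X1.mean(axis=0) - X0.mean(axis=0)            # class-mean difference per feature
    Z = np.vstack([X0 - X0.mean(axis=0), X1 - X1.mean(axis=0)])
    # Pooled within-class covariance plus an assumed ridge regularizer.
    S = Z.T @ Z / (len(X) - 2) + reg * np.eye(X.shape[1])

    selected, J = [], 0.0
    A_inv = np.zeros((0, 0))                         # inverse of S restricted to `selected`
    for _ in range(n_select):
        sel = np.array(selected, dtype=int)
        cand = np.array([k for k in range(X.shape[1]) if k not in selected])
        B = S[np.ix_(sel, cand)]                     # cross-covariances, shape (m, c)
        U = A_inv @ B                                # A^{-1} s for every candidate
        schur = S[cand, cand] - (B * U).sum(axis=0)  # conditional variance of each candidate
        resid = d[cand] - U.T @ d[sel]               # mean difference not explained by `selected`
        gain = resid ** 2 / schur                    # increase in J if the candidate is added
        best = int(np.argmax(gain))

        # Block-inverse update of A_inv for the enlarged subset.
        u, sc, m = U[:, best], schur[best], len(selected)
        new_inv = np.empty((m + 1, m + 1))
        new_inv[:m, :m] = A_inv + np.outer(u, u) / sc
        new_inv[:m, m] = new_inv[m, :m] = -u / sc
        new_inv[m, m] = 1.0 / sc

        A_inv = new_inv
        J += gain[best]
        selected.append(int(cand[best]))
    return selected, J
```

With `reg=0.0` the selected covariance block can become singular once the subset size approaches the number of samples, which is one way to see why some regularization matters in small-sample gene expression settings. How the regularization parameters should be chosen is covered by the paper's guidelines and is not reproduced here.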