Motivation: With the development of DNA microarray technology, scientists can now measure the expression levels of thousands of genes simultaneously in a single experiment. One current difficulty in interpreting microarray data comes from their 'high-dimensional, low sample size' nature: the number of genes far exceeds the number of samples. Robust and accurate gene selection methods are therefore required to identify groups of genes that are differentially expressed across samples, e.g. between cancerous and normal cells. Successful gene selection will help to classify different cancer types, lead to a better understanding of genetic signatures in cancers and improve treatment strategies. Although gene selection and cancer classification are two closely related problems, most existing approaches handle them separately by selecting genes prior to classification. We provide a unified procedure for simultaneous gene selection and cancer classification, achieving high accuracy in both aspects.
Results: In this paper we develop a novel type of regularization in support vector machines (SVMs) to identify important genes for cancer classification. A special nonconvex penalty, called the smoothly clipped absolute deviation (SCAD) penalty, is added to the hinge loss function in the SVM. By systematically thresholding small coefficient estimates to zero, the new procedure eliminates redundant genes automatically and yields a compact and accurate classifier. A successive quadratic algorithm is proposed to convert the non-differentiable and non-convex optimization problem into easily solved systems of linear equations. The method is applied to two real datasets and has produced very promising results.
Availability: MATLAB code is available upon request from the authors.
Contact: hzhang@stat.ncsu.edu
Supplementary information: http://www4.stat.ncsu.edu/~hzhang/research.html
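As an illustration of the SCAD-penalized hinge loss described in the Results, the following is a minimal Python sketch, not the authors' MATLAB code. It assumes the standard three-piece form of the SCAD penalty with the commonly used shape parameter a = 3.7, and it assumes the objective is the average hinge loss plus the sum of the penalties over the gene coefficients; the abstract does not state the exact scaling. The function names `scad_penalty` and `scad_svm_objective` are hypothetical, and the sketch only evaluates the objective without implementing the successive quadratic algorithm.

```python
import numpy as np


def scad_penalty(w, lam, a=3.7):
    """Elementwise SCAD penalty in its standard three-piece form.

    Assumes lam > 0 and a > 2; a = 3.7 is a conventional default,
    not a value taken from the abstract.
    """
    abs_w = np.abs(np.asarray(w, dtype=float))
    # Small coefficients: same as an L1 penalty, which drives them to zero.
    linear = lam * abs_w
    # Middle range: quadratic piece that tapers the penalty smoothly.
    quadratic = -(abs_w**2 - 2 * a * lam * abs_w + lam**2) / (2 * (a - 1))
    # Large coefficients: constant penalty, so they are not shrunk further.
    constant = (a + 1) * lam**2 / 2
    return np.where(abs_w <= lam, linear,
                    np.where(abs_w <= a * lam, quadratic, constant))


def scad_svm_objective(w, b, X, y, lam, a=3.7):
    """Average hinge loss plus summed SCAD penalty (assumed scaling).

    X is an (n_samples, n_genes) array, y holds labels in {-1, +1},
    w is the coefficient vector and b the intercept (not penalized).
    """
    margins = y * (X @ w + b)
    hinge = np.maximum(0.0, 1.0 - margins).mean()
    return hinge + scad_penalty(w, lam, a).sum()
```

Because the penalty is flat beyond a * lam, large coefficients are not penalized any more heavily than moderately large ones, while coefficients near zero are penalized like an L1 term; this is the behavior that allows redundant genes to be thresholded out while the important ones are kept nearly unshrunk.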